Results 1 - 20 of 53
1.
Genome Biol ; 25(1): 145, 2024 06 03.
Article in English | MEDLINE | ID: mdl-38831386

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. To develop bioinformatics tools for scRNA-seq and SRT data and perform unbiased benchmarks, data simulation has been widely adopted by providing explicit ground truth and generating customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, making it challenging to choose suitable methods without practical guidelines. RESULTS: We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 have the best accuracy performance across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but they cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe ( https://github.com/duohongrui/simpipe ; https://doi.org/10.5281/zenodo.11178409 ), and an online tool Simsite ( https://www.ciblab.net/software/simshiny/ ) for data simulation. CONCLUSIONS: No method performs best on all criteria, thus a good-yet-not-the-best method is recommended if it solves problems effectively and reasonably. 
Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Humans , Software , Computer Simulation , Transcriptome , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , RNA-Seq/standards
2.
BMC Genomics ; 25(1): 444, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38711017

ABSTRACT

BACKGROUND: Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed addressing different sources of dispersion and making specific assumptions about the count data. MAIN BODY: The selection of a normalization method has a direct impact on downstream analysis, for example differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this aim, we first give an overview of the different single-cell sequencing platforms and methods commonly used, including isolation and library preparation protocols. Next, we discuss the inherent sources of variability of scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also delineate imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis. CONCLUSIONS: According to the correction performed, normalization methods can be broadly classified as within- and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can further be classified into: global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods has pros and cons and makes different statistical assumptions. However, no single normalization method performs best. Instead, metrics such as silhouette width, K-nearest neighbor batch-effect test, or Highly Variable Genes are recommended to assess the performance of normalization methods.
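
As an illustration of one metric named above, the following is a minimal Python sketch of the mean silhouette width, assuming Euclidean distance and cells given as plain lists of coordinates. It is not taken from the review; the function name `silhouette_width` and the toy data are hypothetical.

```python
import math


def silhouette_width(points, labels):
    """Mean silhouette width over all points, using Euclidean distance.

    For each point, a = mean distance to the other members of its own cluster,
    b = smallest mean distance to any other cluster, score = (b - a) / max(a, b).
    """
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to each cluster, leaving p itself out.
        mean_d = {}
        for c in clusters:
            d = [dist(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            mean_d[c] = sum(d) / len(d) if d else None
        a = mean_d[labels[i]]
        others = [v for c, v in mean_d.items()
                  if c != labels[i] and v is not None]
        # Singleton clusters, a single cluster, or coincident points score 0.
        if a is None or not others or max(a, min(others)) == 0.0:
            scores.append(0.0)
            continue
        b = min(others)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy example (assumed data): two compact, well-separated groups.
cells = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette_width(cells, [0, 0, 1, 1])
bad = silhouette_width(cells, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters; values below 0 indicate that cells sit closer to another cluster than to their own.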


Subject(s)
Single-Cell Analysis , Animals , Humans , Algorithms , Gene Expression Profiling/methods , Gene Expression Profiling/standards , RNA-Seq/methods , RNA-Seq/standards , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome , Datasets as Topic
3.
J Biol Chem ; 299(6): 104810, 2023 06.
Article in English | MEDLINE | ID: mdl-37172729

ABSTRACT

RNA sequencing (RNA-seq) is a powerful technique for understanding cellular state and dynamics. However, comprehensive transcriptomic characterization of multiple RNA-seq datasets is laborious without bioinformatics training and skills. To remove the barriers to sequence data analysis in the research community, we have developed "RNAseqChef" (RNA-seq data controller highlighting expression features), a web-based platform for systematic transcriptome analysis that can automatically detect, integrate, and visualize differentially expressed genes and their biological functions. To validate its versatile performance, we examined the pharmacological action of sulforaphane (SFN), a natural isothiocyanate, on various types of cells and mouse tissues using multiple datasets in vitro and in vivo. Notably, SFN treatment upregulated the ATF6-mediated unfolded protein response in the liver and the NRF2-mediated antioxidant response in the skeletal muscle of diet-induced obese mice. In contrast, the commonly downregulated pathways included collagen synthesis and circadian rhythms in the tissues tested. On the RNAseqChef server, we readily evaluated and visualized all analyzed data and discovered the NRF2-independent action of SFN. Collectively, RNAseqChef provides an easy-to-use open resource that identifies context-dependent transcriptomic features and standardizes data assessment.


Subject(s)
Gene Expression Profiling , Internet , Isothiocyanates , RNA-Seq , Software , Sulfoxides , Animals , Mice , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Isothiocyanates/pharmacology , Sulfoxides/pharmacology , RNA-Seq/methods , RNA-Seq/standards , Organ Specificity/drug effects , Reproducibility of Results , Mice, Obese , Unfolded Protein Response/drug effects , Liver/drug effects , Muscle, Skeletal/drug effects , Antioxidants/metabolism , Data Visualization
4.
Nature ; 608(7924): 733-740, 2022 08.
Article in English | MEDLINE | ID: mdl-35978187

ABSTRACT

Single-cell transcriptomics (scRNA-seq) has greatly advanced our ability to characterize cellular heterogeneity [1]. However, scRNA-seq requires lysing cells, which impedes further molecular or functional analyses on the same cells. Here, we established Live-seq, a single-cell transcriptome profiling approach that preserves cell viability during RNA extraction using fluidic force microscopy [2,3], thus allowing a cell's ground-state transcriptome to be coupled to its downstream molecular or phenotypic behaviour. To benchmark Live-seq, we used cell growth, functional responses and whole-cell transcriptome read-outs to demonstrate that Live-seq can accurately stratify diverse cell types and states without inducing major cellular perturbations. As a proof of concept, we show that Live-seq can be used to directly map a cell's trajectory by sequentially profiling the transcriptomes of individual macrophages before and after lipopolysaccharide (LPS) stimulation, and of adipose stromal cells pre- and post-differentiation. In addition, we demonstrate that Live-seq can function as a transcriptomic recorder by preregistering the transcriptomes of individual macrophages that were subsequently monitored by time-lapse imaging after LPS exposure. This enabled the unsupervised, genome-wide ranking of genes on the basis of their ability to affect macrophage LPS response heterogeneity, revealing basal Nfkbia expression level and cell cycle state as important phenotypic determinants, which we experimentally validated. Thus, Live-seq can address a broad range of biological questions by transforming scRNA-seq from an end-point to a temporal analysis approach.


Subject(s)
Cell Survival , Gene Expression Profiling , Macrophages , RNA-Seq , Single-Cell Analysis , Transcriptome , Adipose Tissue/cytology , Cell Cycle/drug effects , Cell Cycle/genetics , Cell Differentiation , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Genome/drug effects , Genome/genetics , Lipopolysaccharides/immunology , Lipopolysaccharides/pharmacology , Macrophages/cytology , Macrophages/drug effects , Macrophages/immunology , Macrophages/metabolism , NF-KappaB Inhibitor alpha/genetics , Organ Specificity , Phenotype , RNA/genetics , RNA/isolation & purification , RNA-Seq/methods , RNA-Seq/standards , Reproducibility of Results , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Single-Cell Analysis/methods , Stromal Cells/cytology , Stromal Cells/metabolism , Time Factors , Transcriptome/genetics
5.
Sci Rep ; 12(1): 1789, 2022 02 02.
Article in English | MEDLINE | ID: mdl-35110572

ABSTRACT

Despite the recent precipitous decline in the cost of genome sequencing, library preparation for RNA-seq is still laborious and expensive for applications such as high-throughput screening. Limited availability of RNA generated by some experimental workflows poses an additional challenge and increases the cost of RNA library preparation. In a search for low-cost, automation-compatible RNA library preparation kits that maintain strand specificity and are amenable to low-input RNA quantities, we systematically tested two recent commercial technologies, Swift RNA and Swift Rapid RNA (presently offered by Integrated DNA Technologies, IDT), alongside the Illumina TruSeq stranded mRNA kit, the de facto standard workflow for bulk transcriptomics. We used the Universal Human Reference RNA (UHRR) (composed of equal quantities of total RNA from 10 human cancer cell lines) to benchmark gene expression in these kits, at input quantities ranging from 10 to 500 ng. We found normalized read counts between all treatment groups to be in high agreement. Compared to the Illumina TruSeq stranded mRNA kit, both Swift RNA library kits offer shorter workflow times enabled by their patented Adaptase technology. We also found the Swift RNA kit to produce the fewest differentially expressed genes and pathways directly attributable to input mRNA amount.


Subject(s)
Biomarkers, Tumor/genetics , Gene Library , Neoplasms/genetics , RNA, Neoplasm/analysis , RNA-Seq/methods , RNA-Seq/standards , Transcriptome , Gene Expression Profiling , Humans , Neoplasms/pathology , RNA, Neoplasm/genetics , Sequence Analysis, RNA/methods , Tumor Cells, Cultured
6.
Neurosci Lett ; 771: 136468, 2022 02 06.
Article in English | MEDLINE | ID: mdl-35065247

ABSTRACT

Recent RNA-seq studies have generated a new crop of putative gene markers for terminal Schwann cells (tSCs), non-myelinating glia that cap axon terminals at the vertebrate neuromuscular junction (NMJ). While compelling, these studies did not validate the expression of the novel markers using in situ hybridization techniques. Here, we use RNAscope technology to study the expression of top candidates from recent tSC and non-myelinating Schwann cell marker RNA-seq studies. Our results validate the expression of these markers at tSCs but also demonstrate that they are present at other sites in the muscle tissue, specifically, at muscle spindles and along intramuscular nerves.


Subject(s)
Nerve Tissue Proteins/genetics , RNA-Seq/methods , Schwann Cells/metabolism , Animals , Female , In Situ Hybridization, Fluorescence/methods , In Situ Hybridization, Fluorescence/standards , Male , Mice , Mice, Inbred C57BL , Nerve Tissue Proteins/metabolism , Neuromuscular Junction/metabolism , RNA-Seq/standards , Reference Standards
7.
Sci Rep ; 12(1): 380, 2022 01 10.
Article in English | MEDLINE | ID: mdl-35013473

ABSTRACT

Epigenetic modifications are crucial for normal development and implicated in disease pathogenesis. While epigenetics continues to be a burgeoning research area in neuroscience, unaddressed issues related to data reproducibility across laboratories remain. Separating meaningful experimental changes from background variability is a challenge in epigenomic studies. Here we show that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. We examined genome-wide DNA methylation and gene expression profiles of hippocampal tissues from wild-type rats housed in three independent laboratories using nearly identical conditions. Reduced-representation bisulfite sequencing and RNA-seq respectively identified 3852 differentially methylated and 1075 differentially expressed genes between laboratories, even in the absence of experimental intervention. Difficult-to-match factors such as animal vendors and a subset of husbandry and tissue extraction procedures produced quantifiable variations between wild-type animals across the three laboratories. Our study demonstrates that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. This is particularly meaningful for neurological studies in animal models, in which baseline parameters between experimental groups are difficult to control. To enhance scientific rigor, we conclude that strict adherence to protocols is necessary for the execution and interpretation of epigenetic studies and that protocol-sensitive epigenetic changes, amongst naive animals, may confound experimental results.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Epigenome , Epigenomics/standards , Hippocampus/metabolism , Animals , Databases, Genetic , Male , Observer Variation , Quality Control , RNA-Seq/standards , Rats, Sprague-Dawley , Reproducibility of Results
8.
Gene ; 814: 146161, 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-34995736

ABSTRACT

Hepatic alveolar echinococcosis is poorly detected in patients because of its invasive and slow growth. Early diagnosis of hepatic alveolar echinococcosis is therefore important for patients. Circular RNAs (circRNAs) are an important class of non-coding RNA. Recent studies have proposed serum-derived exosomal circRNAs as potential biomarkers for the detection of various diseases. The clinical importance of exosomal circRNAs in hepatic alveolar echinococcosis has not been explored before. Here, we investigated serum-derived exosomal circRNAs in the diagnosis of hepatic alveolar echinococcosis. First, high-throughput sequencing was performed using 9 hepatic alveolar echinococcosis and 9 control samples to detect hepatic alveolar echinococcosis-related circRNAs. Afterwards, bioinformatic analyses were performed to identify differentially expressed circRNAs, and pathway analyses were performed. Finally, validation of the identified circRNAs was performed using RT-PCR. The sequencing data revealed 59 differentially expressed circRNAs in hepatic alveolar echinococcosis patients: 31 up-regulated and 28 down-regulated. The top 5 up-regulated and down-regulated circRNAs were selected for validation by RT-qPCR assay. In this validation, the circRNAs that were significantly up- and down-regulated showed an expression profile consistent with the sequencing results. Importantly, our findings suggest that the identified exosomal circRNAs could be potential biomarkers for the detection of hepatic alveolar echinococcosis in serum and may help to understand the pathogenesis of hepatic alveolar echinococcosis.


Subject(s)
Echinococcosis, Hepatic/genetics , Exosomes/genetics , RNA, Circular/blood , Biomarkers/blood , Echinococcosis, Hepatic/blood , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/standards , Humans , Quality Control , RNA-Seq/standards , Transcriptome
9.
Nucleic Acids Res ; 50(2): e12, 2022 01 25.
Article in English | MEDLINE | ID: mdl-34850101

ABSTRACT

Considerable effort has been devoted to refining experimental protocols to reduce levels of technical variability and artifacts in single-cell RNA-sequencing data (scRNA-seq). We here present evidence that equalizing the concentration of cDNA libraries prior to pooling, a step not consistently performed in single-cell experiments, improves gene detection rates, enhances biological signals, and reduces technical artifacts in scRNA-seq data. To evaluate the effect of equalization on various protocols, we developed Scaffold, a simulation framework that models each step of an scRNA-seq experiment. Numerical experiments demonstrate that equalization reduces variation in sequencing depth and gene-specific expression variability. We then performed a set of experiments in vitro with and without the equalization step and found that equalization increases the number of genes that are detected in every cell by 17-31%, improves discovery of biologically relevant genes, and reduces nuisance signals associated with cell cycle. Further support is provided in an analysis of publicly available data.


Subject(s)
Gene Library , RNA-Seq/methods , Single-Cell Analysis/methods , Algorithms , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Humans , RNA-Seq/standards , Sequence Analysis, RNA/methods , Single-Cell Analysis/standards , Software
10.
Nucleic Acids Res ; 49(15): 8505-8519, 2021 09 07.
Article in English | MEDLINE | ID: mdl-34320202

ABSTRACT

The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.


Subject(s)
Algorithms , Cluster Analysis , RNA, Small Cytoplasmic/genetics , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Arthritis, Rheumatoid/genetics , Bone Marrow Cells/metabolism , COVID-19/blood , COVID-19/pathology , Cohort Studies , Datasets as Topic , Humans , Leukocytes, Mononuclear/metabolism , Leukocytes, Mononuclear/pathology , Mice , Organ Specificity , Quality Control , RNA-Seq/standards , Single-Cell Analysis/standards , Transcriptome
11.
Nucleic Acids Res ; 49(16): e92, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34157120

ABSTRACT

N6-methyladenosine (m6A) is the most abundant internal RNA modification in eukaryotic mRNAs and influences many aspects of RNA processing. miCLIP (m6A individual-nucleotide resolution UV crosslinking and immunoprecipitation) is an antibody-based approach to map m6A sites with single-nucleotide resolution. However, due to broad antibody reactivity, reliable identification of m6A sites from miCLIP data remains challenging. Here, we present miCLIP2 in combination with machine learning to significantly improve m6A detection. The optimized miCLIP2 results in high-complexity libraries from less input material. Importantly, we established a robust computational pipeline to tackle the inherent issue of false positives in antibody-based m6A detection. The analyses were calibrated with Mettl3 knockout cells to learn the characteristics of m6A deposition, including m6A sites outside of DRACH motifs. To make our results universally applicable, we trained a machine learning model, m6Aboost, based on the experimental and RNA sequence features. Importantly, m6Aboost allows prediction of genuine m6A sites in miCLIP2 data without filtering for DRACH motifs or the need for Mettl3 depletion. Using m6Aboost, we identify thousands of high-confidence m6A sites in different murine and human cell lines, which provide a rich resource for future analysis. Collectively, our combined experimental and computational methodology greatly improves m6A identification.


Subject(s)
Adenosine/analogs & derivatives , Machine Learning , RNA Processing, Post-Transcriptional , RNA-Seq/methods , Adenosine/chemistry , Adenosine/metabolism , Animals , HEK293 Cells , Humans , Methyltransferases/genetics , Methyltransferases/metabolism , Mice , Mouse Embryonic Stem Cells/metabolism , Nucleotide Motifs , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Seq/standards , Sensitivity and Specificity
12.
J Mol Diagn ; 23(8): 1015-1029, 2021 08.
Article in English | MEDLINE | ID: mdl-34082071

ABSTRACT

Targeted RNA sequencing (RNA-seq) is a highly accurate method for sequencing transcripts of interest with a high resolution and throughput. However, RNA-seq has not been widely performed in clinical molecular laboratories because of the complexity of data processing and interpretation. We developed and validated a customized RNA-seq panel and data processing protocol for fusion detection using 4 analytical validation samples and 51 clinical samples, covering seven types of hematologic malignancies. Analytical validation showed that the results for target gene coverage and between- and within-run precision and linearity tests were reliable. Using clinical samples, RNA-seq based on filtering and prioritization strategies detected all 25 known fusions previously found by multiplex reverse transcriptase-PCR and fluorescence in situ hybridization. It also detected nine novel fusions. Known fusions detected by RNA-seq included two IGH rearrangements supported by expression analysis. Novel fusions included six that targeted just one partner gene. In addition, 18 disease- and drug resistance-associated transcript variants in ABL1, GATA2, IKZF1, JAK2, RUNX1, and WT1 were designated simultaneously. Expression analysis showed distinct clustering according to subtype and lineage. In conclusion, this study showed that our customized RNA-seq system had a reliable and stable performance for fusion detection, with enhanced diagnostic yield for hematologic malignancies in a clinical diagnostic setting.


Subject(s)
Biomarkers, Tumor , Hematologic Neoplasms/diagnosis , Hematologic Neoplasms/genetics , Oncogene Proteins, Fusion/genetics , RNA-Seq/methods , Computational Biology/methods , Disease Management , High-Throughput Nucleotide Sequencing , Humans , Laboratories, Clinical , Quality Control , RNA-Seq/standards , Reproducibility of Results , Sensitivity and Specificity , Software
13.
Elife ; 10, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34037521

ABSTRACT

Use of adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread, providing new insights into the immune system with potential broad clinical and diagnostic applications. However, like many high-throughput technologies, it comes with several problems, and the AIRR Community was established to understand and help solve them. We, the AIRR Community's Biological Resources Working Group, have surveyed scientists about the need for standards and controls in generating and annotating AIRR-seq data. Here, we review the current status of AIRR-seq, provide the results of our survey, and based on them, offer recommendations for developing AIRR-seq standards and controls, including future work.


Subject(s)
Adaptive Immunity/genetics , Gene Expression Profiling/standards , RNA-Seq/standards , Receptors, Immunologic/genetics , Transcriptome , Animals , Databases, Genetic , Humans , Observer Variation , Quality Control , Reference Standards , Reproducibility of Results
14.
Biomed Res Int ; 2021: 6647597, 2021.
Article in English | MEDLINE | ID: mdl-33987443

ABSTRACT

Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it also faces various challenges. The RNA-seq workflow is extremely complicated and prone to bias, which may damage the quality of RNA-seq datasets and lead to incorrect interpretation of sequencing results. Thus, a detailed understanding of the source and nature of these biases is essential for interpreting RNA-seq data, finding methods to improve the quality of RNA-seq experiments, and developing bioinformatics tools to compensate for these biases. Here, we discuss the sources of experimental bias in RNA-seq. For each type of bias, we discuss methods for improvement, in order to provide useful suggestions for researchers conducting RNA-seq experiments.


Subject(s)
Gene Library , RNA-Seq , RNA , Bias , Computational Biology/standards , Humans , RNA/analysis , RNA/genetics , RNA-Seq/methods , RNA-Seq/standards , Workflow
15.
Genes Cells ; 26(7): 530-540, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33987903

ABSTRACT

Single-cell RNA-sequencing analysis is one of the most effective tools for understanding specific cellular states. The use of single cells or pooled cells in RNA-seq analysis requires the isolation of cells from a tissue or culture. Although trypsin or more recently cold-active protease (CAP) has been used for cell dissociation, the extent to which the gene expression changes are suppressed has not been clarified. To this end, we conducted detailed profiling of the enzyme-dependent gene expression changes in mouse skeletal muscle progenitor cells, focusing on the enzyme treatment time, amount and temperature. We found that the genes whose expression was changed by the enzyme treatment could be classified in a time-dependent manner and that there were genes whose expression was changed independently of the enzyme treatment time, amount and temperature. This study will be useful as reference data for genes that should be excluded or considered for RNA-seq analysis using enzyme isolation methods.


Subject(s)
Myoblasts/metabolism , RNA-Seq/methods , Transcriptome , Animals , Cell Line , Mice , Myoblasts/drug effects , NIH 3T3 Cells , RNA-Seq/standards , Trypsin/pharmacology
16.
Genome Biol ; 22(1): 121, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33926528

ABSTRACT

Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.


Subject(s)
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing , RNA-Seq/methods , Transcriptome , Animals , Eukaryota/genetics , Gene Expression Profiling/standards , Gene Expression Regulation , Humans , Organ Specificity , Prokaryotic Cells/metabolism , RNA/genetics , RNA-Seq/standards , ROC Curve , Sequence Alignment , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Workflow
17.
Methods Mol Biol ; 2284: 303-329, 2021.
Article in English | MEDLINE | ID: mdl-33835450

ABSTRACT

Normalization is an important step in the analysis of single-cell RNA-seq data. While no single method outperforms all others in all datasets, the choice of normalization can have profound impact on the results. Data-driven metrics can be used to rank normalization methods and select the best performers. Here, we show how to use R/Bioconductor to calculate normalization factors, apply them to compute normalized data, and compare several normalization approaches. Finally, we briefly show how to perform downstream analysis steps on the normalized data.
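
The idea of computing normalization factors and applying them can be sketched in a few lines. The chapter itself uses R/Bioconductor; the Python below is only an assumed, simplified illustration of library-size (global scaling) normalization followed by a log transform. The function names `size_factors` and `log_normalize` and the toy count matrix are hypothetical.

```python
import math


def size_factors(counts):
    """Library-size factors: each cell's total count divided by the mean total.

    `counts` is a list of cells, each a list of raw gene counts.
    """
    totals = [sum(cell) for cell in counts]
    if any(t == 0 for t in totals):
        raise ValueError("cells with zero total counts must be filtered out first")
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]


def log_normalize(counts):
    """Divide each cell by its size factor, then apply log(1 + x)."""
    factors = size_factors(counts)
    return [[math.log1p(x / f) for x in cell]
            for cell, f in zip(counts, factors)]


# Toy example (assumed data): the second cell was sequenced twice as deeply.
raw = [[10, 0, 30], [20, 0, 60]]
factors = size_factors(raw)
normalized = log_normalize(raw)
```

After scaling, the two toy cells have the same normalized profile, since they differ only in sequencing depth. Library-size scaling is just one of the approaches that data-driven metrics can be used to compare.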


Subject(s)
RNA-Seq/standards , Single-Cell Analysis/standards , Animals , Computational Biology/methods , Computational Biology/standards , Gene Expression Profiling/methods , Gene Expression Profiling/standards , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Quality Control , RNA-Seq/methods , Reference Standards , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Single-Cell Analysis/methods , Software , Transcriptome , Exome Sequencing
18.
Methods Mol Biol ; 2284: 331-342, 2021.
Article in English | MEDLINE | ID: mdl-33835451

ABSTRACT

Dimensionality reduction is a crucial step in essentially every single-cell RNA-sequencing (scRNA-seq) analysis. In this chapter, we describe the typical dimensionality reduction workflow that is used for scRNA-seq datasets, specifically highlighting the roles of principal component analysis, t-distributed stochastic neighbor embedding, and uniform manifold approximation and projection in this setting. We particularly emphasize efficient computation; the software implementations used in this chapter can scale to datasets with millions of cells.
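
To make the first step of this workflow concrete, here is a minimal Python sketch of principal component analysis that finds only the leading component by power iteration on the covariance matrix. It is an assumed teaching example, not the chapter's software, and it is not designed to scale to millions of cells. The function name `first_principal_component` and the toy data are hypothetical.

```python
import math


def first_principal_component(data, n_iter=200):
    """Return (direction, scores) for the leading principal component.

    `data` is a list of observations (cells), each a list of features (genes).
    Power iteration is applied to the sample covariance matrix.
    """
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    # Deterministic, non-uniform start; a start vector exactly orthogonal to
    # the leading component would fail, which this sketch does not guard against.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    # Project the centered observations onto the component.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores


# Toy example (assumed data): points lying exactly on the line y = 2x.
direction, scores = first_principal_component(
    [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
)
```

In practice the leading components serve as input to the nonlinear embeddings mentioned above, and scalable implementations rely on approximate or randomized decompositions rather than a dense covariance matrix.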


Subject(s)
Computational Biology/methods , RNA-Seq , Single-Cell Analysis , Algorithms , Animals , Data Analysis , Datasets as Topic/statistics & numerical data , Humans , Principal Component Analysis , RNA-Seq/methods , RNA-Seq/standards , RNA-Seq/statistics & numerical data , Single-Cell Analysis/methods , Single-Cell Analysis/standards , Single-Cell Analysis/statistics & numerical data , Software
19.
Genome Biol ; 22(1): 102, 2021 04 12.
Article in English | MEDLINE | ID: mdl-33845875

ABSTRACT

BACKGROUND: Deconvolution analyses have been widely used to track compositional alterations of cell types in gene expression data. Although a large number of novel methods have been developed, due to a lack of understanding of the effects of modeling assumptions and tuning parameters, it is challenging for researchers to select an optimal deconvolution method suitable for the targeted biological conditions. RESULTS: To systematically reveal the pitfalls and challenges of deconvolution analyses, we investigate the impact of several technical and biological factors including simulation model, quantification unit, component number, weight matrix, and unknown content by constructing three benchmarking frameworks. These frameworks cover comparative analysis of 11 popular deconvolution methods under 1766 conditions. CONCLUSIONS: We provide new insights to researchers for future application, standardization, and development of deconvolution tools on RNA-seq data.


Subject(s)
Computational Biology/methods , RNA-Seq/methods , Software , Computational Biology/standards , Gene Expression Profiling/methods , RNA-Seq/standards , Reproducibility of Results
20.
Genes Chromosomes Cancer ; 60(7): 504-524, 2021 07.
Article in English | MEDLINE | ID: mdl-33611828

ABSTRACT

The ability to capture alterations in the genome or transcriptome by next-generation sequencing has provided critical insight into molecular changes and programs underlying cancer biology. With the rapid technological development in single-cell sequencing, it has become possible to study individual cells at the transcriptional, genetic, epigenetic, and protein level. Using single-cell analysis, an increased resolution of fundamental processes underlying cancer development is obtained, providing comprehensive insights otherwise lost by sequencing of entire (bulk) samples, in which molecular signatures of individual cells are averaged across the entire cell population. Here, we provide a concise overview on the application of single-cell analysis of different modalities within cancer research by highlighting key articles of their respective fields. We furthermore examine the potential of existing technologies to meet clinical diagnostic needs and discuss current challenges associated with this translation.


Subject(s)
Genetic Testing/methods , Neoplasms/genetics , RNA-Seq/methods , Single-Cell Analysis/methods , Translational Research, Biomedical/methods , Animals , Genetic Testing/standards , Humans , Neoplasms/diagnosis , RNA-Seq/standards , Single-Cell Analysis/standards , Translational Research, Biomedical/standards