Results 1 - 20 of 57
1.
Nat Commun ; 15(1): 6167, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39039053

ABSTRACT

Translating RNA-seq into clinical diagnostics requires ensuring the reliability and cross-laboratory consistency of detecting clinically relevant subtle differential expressions, such as those between different disease subtypes or stages. As part of the Quartet project, we present an RNA-seq benchmarking study across 45 laboratories using the Quartet and MAQC reference samples spiked with ERCC controls. Based on multiple types of 'ground truth', we systematically assess the real-world RNA-seq performance and investigate the influencing factors involved in 26 experimental processes and 140 bioinformatics pipelines. Here we show greater inter-laboratory variations in detecting subtle differential expressions among the Quartet samples. Experimental factors including mRNA enrichment and strandedness, and each bioinformatics step, emerge as primary sources of variations in gene expression. We underscore the profound influence of experimental execution, and provide best practice recommendations for experimental designs, strategies for filtering low-expression genes, and the optimal gene annotation and analysis pipelines. In summary, this study lays the foundation for developing and quality control of RNA-seq for clinical diagnostic purposes.


Subject(s)
Benchmarking , Computational Biology , Quality Control , RNA-Seq , Reference Standards , Benchmarking/methods , Humans , RNA-Seq/methods , RNA-Seq/standards , Computational Biology/methods , Reproducibility of Results , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Gene Expression Profiling/methods , Gene Expression Profiling/standards , RNA, Messenger/genetics , RNA, Messenger/metabolism
2.
BMC Genomics ; 25(1): 697, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39014352

ABSTRACT

BACKGROUND: Real-time quantitative PCR (RT-qPCR) is one of the most widely used gene expression analyses for validating RNA-seq data. This technique requires reference genes that are stable and highly expressed, at least across the different biological conditions present in the transcriptome. Reference and variable candidate gene selection is often neglected, leading to misinterpretation of the results. RESULTS: We developed a software named "Gene Selector for Validation" (GSV), which identifies the best reference and variable candidate genes for validation within a quantitative transcriptome. This tool also filters the candidate genes concerning the RT-qPCR assay detection limit. GSV was compared with other software using synthetic datasets and performed better, removing stable low-expression genes from the reference candidate list and creating the variable-expression validation list. GSV software was used on a real case, an Aedes aegypti transcriptome. The top GSV reference candidate genes were selected for RT-qPCR analysis, confirming that eiF1A and eiF3j were the most stable genes tested. The tool also confirmed that traditional mosquito reference genes were less stable in the analyzed samples, highlighting the possibility of inappropriate choices. A meta-transcriptome dataset with more than ninety thousand genes was also processed successfully. CONCLUSION: The GSV tool is a time and cost-effective tool that can be used to select reference and validation candidate genes from the biological conditions present in transcriptomic data.


Subject(s)
Real-Time Polymerase Chain Reaction , Reference Standards , Software , Real-Time Polymerase Chain Reaction/methods , Real-Time Polymerase Chain Reaction/standards , Animals , RNA-Seq/methods , RNA-Seq/standards , Gene Expression Profiling/methods , Transcriptome
3.
Eur J Orthod ; 46(4)2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39066623

ABSTRACT

BACKGROUND: The robustness and credibility of RT-qPCR results are critically dependent on the selection of suitable reference genes. However, the mineralization of the extracellular matrix can alter the intracellular tension and energy metabolism within cells, potentially impacting the expression of traditional reference genes, namely Actb and Gapdh. OBJECTIVE: To methodically identify appropriate reference genes for research focused on mouse cementoblast mineralization. MATERIALS AND METHODS: Time-series transcriptomic data of mouse cementoblast mineralization were used. To ensure expression stability and medium to high expression levels, three specific criteria were applied to select potential reference genes. The expression stability of these genes was ranked based on the DI index (1/coefficient of variation) to identify the top six potential reference genes. RT-qPCR validation was performed on these top six candidates, comparing their performance against six previously used reference genes (Rpl22, Ppib, Gusb, Rplp0, Actb, and Gapdh). Cq values of these 12 genes were analyzed by RefFinder to get a stability ranking. RESULTS: A total of 4418 (12.27%) genes met the selection criteria. Among them, Rab5if, Chmp4b, Birc5, Pea15a, Nudc, Supt4a were identified as candidate reference genes. RefFinder analyses revealed that two candidates (Birc5 and Nudc) exhibited superior performance compared to previously used reference genes. LIMITATIONS: RefFinder's stability ranking does not consider the influence of primer efficiency. CONCLUSIONS AND IMPLICATIONS: We propose Birc5 and Nudc as candidate reference genes for RT-qPCR studies investigating mouse cementoblast mineralization and cementum repair.
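The stability ranking described in this abstract, a DI index defined as 1/coefficient of variation, can be illustrated with a minimal sketch. The gene names and expression values below are hypothetical, the use of the sample standard deviation is an assumption (the abstract does not say which estimator was used), and the study's three selection criteria are not reproduced here.

```python
from statistics import mean, stdev


def di_index(values):
    """Return 1 / coefficient of variation, i.e. mean / standard deviation.

    Assumption: the sample standard deviation is used.
    """
    m = mean(values)
    s = stdev(values)
    if m <= 0:
        raise ValueError("DI index needs a positive mean expression")
    if s == 0:
        return float("inf")  # perfectly constant expression
    return m / s


def rank_by_stability(expression, top_n=6):
    """Rank genes from most to least stable and keep the top_n."""
    scores = {gene: di_index(vals) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical expression values across four time points.
expression = {
    "geneA": [100.0, 102.0, 98.0, 101.0],
    "geneB": [50.0, 80.0, 20.0, 65.0],
    "geneC": [200.0, 210.0, 190.0, 205.0],
}
ranking = rank_by_stability(expression, top_n=2)
print(ranking)
```

A higher DI index means lower relative variability, so the most stable genes come first in the ranking.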


Subject(s)
Dental Cementum , Real-Time Polymerase Chain Reaction , Survivin , Animals , Mice , Real-Time Polymerase Chain Reaction/methods , Real-Time Polymerase Chain Reaction/standards , Survivin/genetics , Reference Standards , RNA-Seq/methods , RNA-Seq/standards , Calcification, Physiologic/genetics
4.
Article in English | MEDLINE | ID: mdl-39049508

ABSTRACT

Gene set scoring (GSS) has been routinely conducted for gene expression analysis of bulk or single-cell RNA sequencing (RNA-seq) data, which helps to decipher single-cell heterogeneity and cell type-specific variability by incorporating prior knowledge from functional gene sets. Single-cell assay for transposase accessible chromatin using sequencing (scATAC-seq) is a powerful technique for interrogating single-cell chromatin-based gene regulation, and genes or gene sets with dynamic regulatory potentials can be regarded as cell type-specific markers as if in single-cell RNA-seq (scRNA-seq). However, there are few GSS tools specifically designed for scATAC-seq, and the applicability and performance of RNA-seq GSS tools on scATAC-seq data remain to be investigated. Here, we systematically benchmarked ten GSS tools, including four bulk RNA-seq tools, five scRNA-seq tools, and one scATAC-seq method. First, using matched scATAC-seq and scRNA-seq datasets, we found that the performance of GSS tools on scATAC-seq data was comparable to that on scRNA-seq, suggesting their applicability to scATAC-seq. Then, the performance of different GSS tools was extensively evaluated using up to ten scATAC-seq datasets. Moreover, we evaluated the impact of gene activity conversion, dropout imputation, and gene set collections on the results of GSS. Results show that dropout imputation can significantly promote the performance of almost all GSS tools, while the impact of gene activity conversion methods or gene set collections on GSS performance is more dependent on GSS tools or datasets. Finally, we provided practical guidelines for choosing appropriate preprocessing methods and GSS tools in different application scenarios.


Subject(s)
Algorithms , Benchmarking , Chromatin Immunoprecipitation Sequencing , Single-Cell Analysis , Single-Cell Analysis/methods , Single-Cell Analysis/standards , Humans , Chromatin Immunoprecipitation Sequencing/methods , RNA-Seq/methods , RNA-Seq/standards , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Chromatin/genetics , Chromatin/metabolism
5.
Genome Biol ; 25(1): 145, 2024 06 03.
Article in English | MEDLINE | ID: mdl-38831386

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. To develop bioinformatics tools for scRNA-seq and SRT data and perform unbiased benchmarks, data simulation has been widely adopted by providing explicit ground truth and generating customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, making it challenging to choose suitable methods without practical guidelines. RESULTS: We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 have the best accuracy performance across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but they cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe ( https://github.com/duohongrui/simpipe ; https://doi.org/10.5281/zenodo.11178409 ), and an online tool Simsite ( https://www.ciblab.net/software/simshiny/ ) for data simulation. CONCLUSIONS: No method performs best on all criteria, thus a good-yet-not-the-best method is recommended if it solves problems effectively and reasonably. Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Humans , Software , Computer Simulation , Transcriptome , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods , RNA-Seq/standards
6.
BMC Genomics ; 25(1): 444, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38711017

ABSTRACT

BACKGROUND: Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed addressing different sources of dispersion and making specific assumptions about the count data. MAIN BODY: The selection of a normalization method has a direct impact on downstream analysis, for example differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this aim, we first give an overview of the different single-cell sequencing platforms and methods commonly used, including isolation and library preparation protocols. Next, we discuss the inherent sources of variability of scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also delineate imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis. CONCLUSIONS: According to the correction performed, normalization methods can be broadly classified as within- and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can further be classified into: global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods has pros and cons and makes different statistical assumptions. However, no single normalization method performs best. Instead, metrics such as silhouette width, K-nearest neighbor batch-effect test, or Highly Variable Genes are recommended to assess the performance of normalization methods.
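As an illustration of the "global scaling" category named in this abstract, the sketch below rescales each cell's counts to a common total and applies a log transform. It is a toy version of the general idea, not a method taken from the review; the function name, the `target_sum` default of 10,000, and the count matrix are all assumptions made for the example.

```python
import math


def normalize_global_scaling(counts, target_sum=10_000):
    """Scale each cell (row) to target_sum total counts, then apply log1p.

    counts: list of per-cell lists of raw gene counts.
    target_sum: assumed common library size after scaling.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell has no information to rescale.
            normalized.append([0.0] * len(cell))
            continue
        factor = target_sum / total
        normalized.append([math.log1p(c * factor) for c in cell])
    return normalized


# Two hypothetical cells with the same composition but a 10-fold
# difference in sequencing depth.
counts = [[10, 0, 90], [1, 0, 9]]
norm = normalize_global_scaling(counts)
print(norm)
```

Because both cells have the same relative composition, their normalized profiles coincide, which is the depth correction that global scaling aims for. Differences that are not proportional to depth are left untouched, which is one reason the review describes other method families.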


Subject(s)
Single-Cell Analysis , Animals , Humans , Algorithms , Gene Expression Profiling/methods , Gene Expression Profiling/standards , RNA-Seq/methods , RNA-Seq/standards , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome , Datasets as Topic
7.
J Biol Chem ; 299(6): 104810, 2023 06.
Article in English | MEDLINE | ID: mdl-37172729

ABSTRACT

RNA sequencing (RNA-seq) is a powerful technique for understanding cellular state and dynamics. However, comprehensive transcriptomic characterization of multiple RNA-seq datasets is laborious without bioinformatics training and skills. To remove the barriers to sequence data analysis in the research community, we have developed "RNAseqChef" (RNA-seq data controller highlighting expression features), a web-based platform of systematic transcriptome analysis that can automatically detect, integrate, and visualize differentially expressed genes and their biological functions. To validate its versatile performance, we examined the pharmacological action of sulforaphane (SFN), a natural isothiocyanate, on various types of cells and mouse tissues using multiple datasets in vitro and in vivo. Notably, SFN treatment upregulated the ATF6-mediated unfolded protein response in the liver and the NRF2-mediated antioxidant response in the skeletal muscle of diet-induced obese mice. In contrast, the commonly downregulated pathways included collagen synthesis and circadian rhythms in the tissues tested. On the server of RNAseqChef, we simply evaluated and visualized all analyzing data and discovered the NRF2-independent action of SFN. Collectively, RNAseqChef provides an easy-to-use open resource that identifies context-dependent transcriptomic features and standardizes data assessment.


Subject(s)
Gene Expression Profiling , Internet , Isothiocyanates , RNA-Seq , Software , Sulfoxides , Animals , Mice , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Isothiocyanates/pharmacology , Sulfoxides/pharmacology , RNA-Seq/methods , RNA-Seq/standards , Organ Specificity/drug effects , Reproducibility of Results , Mice, Obese , Unfolded Protein Response/drug effects , Liver/drug effects , Muscle, Skeletal/drug effects , Antioxidants/metabolism , Data Visualization
8.
Nature ; 608(7924): 733-740, 2022 08.
Article in English | MEDLINE | ID: mdl-35978187

ABSTRACT

Single-cell transcriptomics (scRNA-seq) has greatly advanced our ability to characterize cellular heterogeneity. However, scRNA-seq requires lysing cells, which impedes further molecular or functional analyses on the same cells. Here, we established Live-seq, a single-cell transcriptome profiling approach that preserves cell viability during RNA extraction using fluidic force microscopy, thus allowing a cell's ground-state transcriptome to be coupled to its downstream molecular or phenotypic behaviour. To benchmark Live-seq, we used cell growth, functional responses and whole-cell transcriptome read-outs to demonstrate that Live-seq can accurately stratify diverse cell types and states without inducing major cellular perturbations. As a proof of concept, we show that Live-seq can be used to directly map a cell's trajectory by sequentially profiling the transcriptomes of individual macrophages before and after lipopolysaccharide (LPS) stimulation, and of adipose stromal cells pre- and post-differentiation. In addition, we demonstrate that Live-seq can function as a transcriptomic recorder by preregistering the transcriptomes of individual macrophages that were subsequently monitored by time-lapse imaging after LPS exposure. This enabled the unsupervised, genome-wide ranking of genes on the basis of their ability to affect macrophage LPS response heterogeneity, revealing basal Nfkbia expression level and cell cycle state as important phenotypic determinants, which we experimentally validated. Thus, Live-seq can address a broad range of biological questions by transforming scRNA-seq from an end-point to a temporal analysis approach.


Subject(s)
Cell Survival , Gene Expression Profiling , Macrophages , RNA-Seq , Single-Cell Analysis , Transcriptome , Adipose Tissue/cytology , Cell Cycle/drug effects , Cell Cycle/genetics , Cell Differentiation , Gene Expression Profiling/methods , Gene Expression Profiling/standards , Genome/drug effects , Genome/genetics , Lipopolysaccharides/immunology , Lipopolysaccharides/pharmacology , Macrophages/cytology , Macrophages/drug effects , Macrophages/immunology , Macrophages/metabolism , NF-KappaB Inhibitor alpha/genetics , Organ Specificity , Phenotype , RNA/genetics , RNA/isolation & purification , RNA-Seq/methods , RNA-Seq/standards , Reproducibility of Results , Sequence Analysis, RNA/methods , Sequence Analysis, RNA/standards , Single-Cell Analysis/methods , Stromal Cells/cytology , Stromal Cells/metabolism , Time Factors , Transcriptome/genetics
9.
Sci Rep ; 12(1): 1789, 2022 02 02.
Article in English | MEDLINE | ID: mdl-35110572

ABSTRACT

Despite the recent precipitous decline in the cost of genome sequencing, library preparation for RNA-seq is still laborious and expensive for applications such as high-throughput screening. Limited availability of RNA generated by some experimental workflows poses an additional challenge and increases the cost of RNA library preparation. In a search for low-cost, automation-compatible RNA library preparation kits that maintain strand specificity and are amenable to low input RNA quantities, we systematically tested two recent commercial technologies, Swift RNA and Swift Rapid RNA (both presently offered by Integrated DNA Technologies, IDT), alongside the Illumina TruSeq stranded mRNA, the de facto standard workflow for bulk transcriptomics. We used the Universal Human Reference RNA (UHRR) (composed of equal quantities of total RNA from 10 human cancer cell lines) to benchmark gene expression in these kits, at input quantities ranging from 10 to 500 ng. We found normalized read counts between all treatment groups to be in high agreement. Compared to the Illumina TruSeq stranded mRNA kit, both Swift RNA library kits offer shorter workflow times enabled by their patented Adaptase technology. We also found the Swift RNA kit to produce the smallest number of differentially expressed genes and pathways directly attributable to input mRNA amount.


Subject(s)
Biomarkers, Tumor/genetics , Gene Library , Neoplasms/genetics , RNA, Neoplasm/analysis , RNA-Seq/methods , RNA-Seq/standards , Transcriptome , Gene Expression Profiling , Humans , Neoplasms/pathology , RNA, Neoplasm/genetics , Sequence Analysis, RNA/methods , Tumor Cells, Cultured
10.
Neurosci Lett ; 771: 136468, 2022 02 06.
Article in English | MEDLINE | ID: mdl-35065247

ABSTRACT

Recent RNA-seq studies have generated a new crop of putative gene markers for terminal Schwann cells (tSCs), non-myelinating glia that cap axon terminals at the vertebrate neuromuscular junction (NMJ). While compelling, these studies did not validate the expression of the novel markers using in situ hybridization techniques. Here, we use RNAscope technology to study the expression of top candidates from recent tSC and non-myelinating Schwann cell marker RNA-seq studies. Our results validate the expression of these markers at tSCs but also demonstrate that they are present at other sites in the muscle tissue, specifically, at muscle spindles and along intramuscular nerves.


Subject(s)
Nerve Tissue Proteins/genetics , RNA-Seq/methods , Schwann Cells/metabolism , Animals , Female , In Situ Hybridization, Fluorescence/methods , In Situ Hybridization, Fluorescence/standards , Male , Mice , Mice, Inbred C57BL , Nerve Tissue Proteins/metabolism , Neuromuscular Junction/metabolism , RNA-Seq/standards , Reference Standards
11.
Sci Rep ; 12(1): 380, 2022 01 10.
Article in English | MEDLINE | ID: mdl-35013473

ABSTRACT

Epigenetic modifications are crucial for normal development and implicated in disease pathogenesis. While epigenetics continues to be a burgeoning research area in neuroscience, unaddressed issues related to data reproducibility across laboratories remain. Separating meaningful experimental changes from background variability is a challenge in epigenomic studies. Here we show that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. We examined genome-wide DNA methylation and gene expression profiles of hippocampal tissues from wild-type rats housed in three independent laboratories using nearly identical conditions. Reduced-representation bisulfite sequencing and RNA-seq respectively identified 3852 differentially methylated and 1075 differentially expressed genes between laboratories, even in the absence of experimental intervention. Difficult-to-match factors such as animal vendors and a subset of husbandry and tissue extraction procedures produced quantifiable variations between wild-type animals across the three laboratories. Our study demonstrates that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. This is particularly meaningful for neurological studies in animal models, in which baseline parameters between experimental groups are difficult to control. To enhance scientific rigor, we conclude that strict adherence to protocols is necessary for the execution and interpretation of epigenetic studies and that protocol-sensitive epigenetic changes, amongst naive animals, may confound experimental results.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Epigenome , Epigenomics/standards , Hippocampus/metabolism , Animals , Databases, Genetic , Male , Observer Variation , Quality Control , RNA-Seq/standards , Rats, Sprague-Dawley , Reproducibility of Results
12.
Gene ; 814: 146161, 2022 Mar 10.
Article in English | MEDLINE | ID: mdl-34995736

ABSTRACT

Hepatic alveolar echinococcosis is poorly detected because of its invasive and slow growth. Thus, early diagnosis of hepatic alveolar echinococcosis is important for patients. Circular RNAs (circRNAs) are a crucial type of non-coding RNA. Recent studies have proposed serum-derived exosomal circRNAs as potential biomarkers for the detection of various diseases. The clinical importance of exosomal circRNAs in hepatic alveolar echinococcosis has not been explored before. Here, we investigated serum-derived exosomal circRNAs in the diagnosis of hepatic alveolar echinococcosis. First, high-throughput sequencing was performed using 9 hepatic alveolar echinococcosis and 9 control samples to detect hepatic alveolar echinococcosis-related circRNAs. Afterwards, bioinformatic analyses were performed to identify differentially expressed circRNAs, and pathway analyses were performed. Finally, validation of the identified circRNAs was performed using RT-PCR. The sequencing data indicated 59 differentially expressed circRNAs in hepatic alveolar echinococcosis patients: 31 up-regulated and 28 down-regulated. The top 5 up-regulated and down-regulated circRNAs were selected for validation by RT-qPCR assay. In the validation, the circRNAs that were significantly up- and down-regulated showed an expression profile consistent with the sequencing results. Importantly, our findings suggest that the identified exosomal circRNAs could be potential serum biomarkers for the detection of hepatic alveolar echinococcosis and may help to understand its pathogenesis.


Subject(s)
Echinococcosis, Hepatic/genetics , Exosomes/genetics , RNA, Circular/blood , Biomarkers/blood , Echinococcosis, Hepatic/blood , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/standards , Humans , Quality Control , RNA-Seq/standards , Transcriptome
13.
Nucleic Acids Res ; 50(2): e12, 2022 01 25.
Article in English | MEDLINE | ID: mdl-34850101

ABSTRACT

Considerable effort has been devoted to refining experimental protocols to reduce levels of technical variability and artifacts in single-cell RNA-sequencing data (scRNA-seq). We here present evidence that equalizing the concentration of cDNA libraries prior to pooling, a step not consistently performed in single-cell experiments, improves gene detection rates, enhances biological signals, and reduces technical artifacts in scRNA-seq data. To evaluate the effect of equalization on various protocols, we developed Scaffold, a simulation framework that models each step of an scRNA-seq experiment. Numerical experiments demonstrate that equalization reduces variation in sequencing depth and gene-specific expression variability. We then performed a set of experiments in vitro with and without the equalization step and found that equalization increases the number of genes that are detected in every cell by 17-31%, improves discovery of biologically relevant genes, and reduces nuisance signals associated with cell cycle. Further support is provided in an analysis of publicly available data.
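The effect of equalization on sequencing depth reported in this abstract can be illustrated with a deliberately simple toy model. This is not the Scaffold framework. It assumes that equal volumes of each library are pooled, that reads are shared in proportion to library concentration, and that equalization dilutes any library above a chosen target concentration down to that target. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def allocate_reads(concentrations, total_reads):
    """Share total_reads among libraries in proportion to concentration."""
    total = sum(concentrations)
    return [total_reads * c / total for c in concentrations]


def equalize(concentrations, target):
    """Dilute libraries above target down to target; leave the rest as is."""
    return [min(c, target) for c in concentrations]


def cv(values):
    """Coefficient of variation (population standard deviation / mean)."""
    return pstdev(values) / mean(values)


# Hypothetical cDNA library concentrations (arbitrary units).
conc = [0.5, 1.0, 2.0, 4.0]
depth_raw = allocate_reads(conc, 1_000_000)
depth_eq = allocate_reads(equalize(conc, target=1.0), 1_000_000)

print("CV of depth without equalization:", cv(depth_raw))
print("CV of depth with equalization:   ", cv(depth_eq))
```

Under these assumptions the spread in per-library depth shrinks after equalization, in line with the direction of the effect the abstract reports. The model says nothing about gene detection or cell-cycle signals.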


Subject(s)
Gene Library , RNA-Seq/methods , Single-Cell Analysis/methods , Algorithms , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Humans , RNA-Seq/standards , Sequence Analysis, RNA/methods , Single-Cell Analysis/standards , Software
14.
Nucleic Acids Res ; 49(15): 8505-8519, 2021 09 07.
Article in English | MEDLINE | ID: mdl-34320202

ABSTRACT

The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.
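The idea of reference projection mentioned in this abstract can be sketched in a few lines: each cell is described by its correlation with a set of reference transcriptomes rather than by its raw expression values. The code below is a minimal illustration, not the RCA2 implementation. It uses Pearson correlation and a nearest-reference label as simplifying assumptions, omits the graph-based clustering step, and uses hypothetical reference profiles and cells.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant profile
    return cov / math.sqrt(vx * vy)


def project(cell, reference):
    """Project one cell onto the reference panel: one score per profile."""
    return {label: pearson(cell, profile) for label, profile in reference.items()}


def assign_label(cell, reference):
    """Label a cell with its best-correlated reference profile."""
    scores = project(cell, reference)
    return max(scores, key=scores.get)


# Hypothetical reference panel and cells over the same four genes.
reference = {
    "T cell": [9.0, 1.0, 0.5, 0.2],
    "monocyte": [0.3, 0.8, 7.0, 8.5],
}
cells = [[8.0, 2.0, 1.0, 0.0], [0.0, 1.0, 6.0, 9.0]]
labels = [assign_label(c, reference) for c in cells]
print(labels)
```

Because the projection depends on the shape of each cell's profile relative to the references, a uniform scaling of a cell's expression does not change its scores, which gives some intuition for the batch-effect robustness the abstract describes.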


Subject(s)
Algorithms , Cluster Analysis , RNA, Small Cytoplasmic/genetics , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Arthritis, Rheumatoid/genetics , Bone Marrow Cells/metabolism , COVID-19/blood , COVID-19/pathology , Cohort Studies , Datasets as Topic , Humans , Leukocytes, Mononuclear/metabolism , Leukocytes, Mononuclear/pathology , Mice , Organ Specificity , Quality Control , RNA-Seq/standards , Single-Cell Analysis/standards , Transcriptome
15.
J Mol Diagn ; 23(8): 1015-1029, 2021 08.
Article in English | MEDLINE | ID: mdl-34082071

ABSTRACT

Targeted RNA sequencing (RNA-seq) is a highly accurate method for sequencing transcripts of interest with a high resolution and throughput. However, RNA-seq has not been widely performed in clinical molecular laboratories because of the complexity of data processing and interpretation. We developed and validated a customized RNA-seq panel and data processing protocol for fusion detection using 4 analytical validation samples and 51 clinical samples, covering seven types of hematologic malignancies. Analytical validation showed that the results for target gene coverage and between- and within-run precision and linearity tests were reliable. Using clinical samples, RNA-seq based on filtering and prioritization strategies detected all 25 known fusions previously found by multiplex reverse transcriptase-PCR and fluorescence in situ hybridization. It also detected nine novel fusions. Known fusions detected by RNA-seq included two IGH rearrangements supported by expression analysis. Novel fusions included six that targeted just one partner gene. In addition, 18 disease- and drug resistance-associated transcript variants in ABL1, GATA2, IKZF1, JAK2, RUNX1, and WT1 were designated simultaneously. Expression analysis showed distinct clustering according to subtype and lineage. In conclusion, this study showed that our customized RNA-seq system had a reliable and stable performance for fusion detection, with enhanced diagnostic yield for hematologic malignancies in a clinical diagnostic setting.


Subject(s)
Biomarkers, Tumor , Hematologic Neoplasms/diagnosis , Hematologic Neoplasms/genetics , Oncogene Proteins, Fusion/genetics , RNA-Seq/methods , Computational Biology/methods , Disease Management , High-Throughput Nucleotide Sequencing , Humans , Laboratories, Clinical , Quality Control , RNA-Seq/standards , Reproducibility of Results , Sensitivity and Specificity , Software
16.
Nucleic Acids Res ; 49(16): e92, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34157120

ABSTRACT

N6-methyladenosine (m6A) is the most abundant internal RNA modification in eukaryotic mRNAs and influences many aspects of RNA processing. miCLIP (m6A individual-nucleotide resolution UV crosslinking and immunoprecipitation) is an antibody-based approach to map m6A sites with single-nucleotide resolution. However, due to broad antibody reactivity, reliable identification of m6A sites from miCLIP data remains challenging. Here, we present miCLIP2 in combination with machine learning to significantly improve m6A detection. The optimized miCLIP2 results in high-complexity libraries from less input material. Importantly, we established a robust computational pipeline to tackle the inherent issue of false positives in antibody-based m6A detection. The analyses were calibrated with Mettl3 knockout cells to learn the characteristics of m6A deposition, including m6A sites outside of DRACH motifs. To make our results universally applicable, we trained a machine learning model, m6Aboost, based on the experimental and RNA sequence features. Importantly, m6Aboost allows prediction of genuine m6A sites in miCLIP2 data without filtering for DRACH motifs or the need for Mettl3 depletion. Using m6Aboost, we identify thousands of high-confidence m6A sites in different murine and human cell lines, which provide a rich resource for future analysis. Collectively, our combined experimental and computational methodology greatly improves m6A identification.


Subject(s)
Adenosine/analogs & derivatives , Machine Learning , RNA Processing, Post-Transcriptional , RNA-Seq/methods , Adenosine/chemistry , Adenosine/metabolism , Animals , HEK293 Cells , Humans , Methyltransferases/genetics , Methyltransferases/metabolism , Mice , Mouse Embryonic Stem Cells/metabolism , Nucleotide Motifs , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Seq/standards , Sensitivity and Specificity
17.
Biomed Res Int ; 2021: 6647597, 2021.
Article in English | MEDLINE | ID: mdl-33987443

ABSTRACT

Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it still faces various challenges. The RNA-seq workflow is highly complex and prone to bias, which can compromise the quality of RNA-seq datasets and lead to incorrect interpretation of sequencing results. A detailed understanding of the source and nature of these biases is therefore essential for interpreting RNA-seq data, for finding methods to improve the quality of RNA-seq experiments, and for developing bioinformatics tools that compensate for these biases. Here, we discuss the sources of experimental bias in RNA-seq and, for each type of bias, methods for improvement, in order to provide useful suggestions for researchers conducting RNA-seq experiments.


Subject(s)
Gene Library , RNA-Seq , RNA , Bias , Computational Biology/standards , Humans , RNA/analysis , RNA/genetics , RNA-Seq/methods , RNA-Seq/standards , Workflow
18.
Genes Cells ; 26(7): 530-540, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33987903

ABSTRACT

Single-cell RNA-sequencing analysis is one of the most effective tools for understanding specific cellular states. The use of single cells or pooled cells in RNA-seq analysis requires the isolation of cells from a tissue or culture. Although trypsin or more recently cold-active protease (CAP) has been used for cell dissociation, the extent to which the gene expression changes are suppressed has not been clarified. To this end, we conducted detailed profiling of the enzyme-dependent gene expression changes in mouse skeletal muscle progenitor cells, focusing on the enzyme treatment time, amount and temperature. We found that the genes whose expression was changed by the enzyme treatment could be classified in a time-dependent manner and that there were genes whose expression was changed independently of the enzyme treatment time, amount and temperature. This study will be useful as reference data for genes that should be excluded or considered for RNA-seq analysis using enzyme isolation methods.


Subject(s)
Myoblasts/metabolism , RNA-Seq/methods , Transcriptome , Animals , Cell Line , Mice , Myoblasts/drug effects , NIH 3T3 Cells , RNA-Seq/standards , Trypsin/pharmacology
19.
Elife ; 10, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34037521

ABSTRACT

Use of adaptive immune receptor repertoire sequencing (AIRR-seq) has become widespread, providing new insights into the immune system with potential broad clinical and diagnostic applications. However, like many high-throughput technologies, it comes with several problems, and the AIRR Community was established to understand and help solve them. We, the AIRR Community's Biological Resources Working Group, have surveyed scientists about the need for standards and controls in generating and annotating AIRR-seq data. Here, we review the current status of AIRR-seq, provide the results of our survey, and based on them, offer recommendations for developing AIRR-seq standards and controls, including future work.


Subject(s)
Adaptive Immunity/genetics , Gene Expression Profiling/standards , RNA-Seq/standards , Receptors, Immunologic/genetics , Transcriptome , Animals , Databases, Genetic , Humans , Observer Variation , Quality Control , Reference Standards , Reproducibility of Results
20.
Genome Biol ; 22(1): 121, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33926528

ABSTRACT

Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.


Subject(s)
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing , RNA-Seq/methods , Transcriptome , Animals , Eukaryota/genetics , Gene Expression Profiling/standards , Gene Expression Regulation , Humans , Organ Specificity , Prokaryotic Cells/metabolism , RNA/genetics , RNA-Seq/standards , ROC Curve , Sequence Alignment , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Workflow