Results 1 - 20 of 1,126
1.
Cell ; 185(1): 184-203.e19, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34963056

ABSTRACT

Cancers display significant heterogeneity with respect to tissue of origin, driver mutations, and other features of the surrounding tissue. It is likely that individual tumors engage common patterns of the immune system (here, "archetypes"), creating prototypical non-destructive tumor immune microenvironments (TMEs) and modulating tumor-targeting. To discover the dominant immune system archetypes, the University of California, San Francisco (UCSF) Immunoprofiler Initiative (IPI) processed 364 individual tumors across 12 cancer types using standardized protocols. Computational clustering of flow cytometry and transcriptomic data obtained from cell sub-compartments uncovered dominant patterns of immune composition across cancers. These archetypes were profound insofar as they also differentiated tumors based upon unique immune and tumor gene-expression patterns. They also partitioned well-established classifications of tumor biology. The IPI resource provides a template for understanding cancer immunity as a collection of dominant patterns of immune organization and provides a rational path forward to learn how to modulate these to improve therapy.


Subject(s)
Censuses; Neoplasms/genetics; Neoplasms/immunology; Transcriptome/genetics; Tumor Microenvironment/immunology; Biomarkers, Tumor; Cluster Analysis; Cohort Studies; Computational Biology/methods; Flow Cytometry/methods; Gene Expression Regulation, Neoplastic; Humans; Neoplasms/classification; Neoplasms/pathology; RNA-Seq/methods; San Francisco; Universities
2.
Cell ; 178(6): 1465-1477.e17, 2019 09 05.
Article in English | MEDLINE | ID: mdl-31491388

ABSTRACT

Most human protein-coding genes are regulated by multiple, distinct promoters, suggesting that the choice of promoter is as important as its level of transcriptional activity. However, while a global change in transcription is recognized as a defining feature of cancer, the contribution of alternative promoters remains largely unexplored. Here, we infer active promoters using RNA-seq data from 18,468 cancer and normal samples, demonstrating that alternative promoters are a major contributor to context-specific regulation of transcription. We find that promoters are deregulated across tissues, cancer types, and patients, affecting known cancer genes and novel candidates. For genes with independently regulated promoters, we demonstrate that promoter activity provides a more accurate predictor of patient survival than gene expression. Our study suggests that a dynamic landscape of active promoters shapes the cancer transcriptome, opening new diagnostic avenues and opportunities to further explore the interplay of regulatory mechanisms with transcriptional aberrations in cancer.


Subject(s)
Computational Biology/methods; Gene Expression Regulation, Neoplastic/genetics; Neoplasms/genetics; Promoter Regions, Genetic/genetics; Transcriptome/genetics; Databases, Genetic; Humans; RNA-Seq/methods
3.
Nature ; 608(7924): 733-740, 2022 08.
Article in English | MEDLINE | ID: mdl-35978187

ABSTRACT

Single-cell transcriptomics (scRNA-seq) has greatly advanced our ability to characterize cellular heterogeneity [1]. However, scRNA-seq requires lysing cells, which impedes further molecular or functional analyses on the same cells. Here, we established Live-seq, a single-cell transcriptome profiling approach that preserves cell viability during RNA extraction using fluidic force microscopy [2,3], thus making it possible to couple a cell's ground-state transcriptome to its downstream molecular or phenotypic behaviour. To benchmark Live-seq, we used cell growth, functional responses and whole-cell transcriptome read-outs to demonstrate that Live-seq can accurately stratify diverse cell types and states without inducing major cellular perturbations. As a proof of concept, we show that Live-seq can be used to directly map a cell's trajectory by sequentially profiling the transcriptomes of individual macrophages before and after lipopolysaccharide (LPS) stimulation, and of adipose stromal cells pre- and post-differentiation. In addition, we demonstrate that Live-seq can function as a transcriptomic recorder by preregistering the transcriptomes of individual macrophages that were subsequently monitored by time-lapse imaging after LPS exposure. This enabled the unsupervised, genome-wide ranking of genes on the basis of their ability to affect macrophage LPS response heterogeneity, revealing basal Nfkbia expression level and cell cycle state as important phenotypic determinants, which we experimentally validated. Thus, Live-seq can address a broad range of biological questions by transforming scRNA-seq from an end-point to a temporal analysis approach.


Subject(s)
Cell Survival; Gene Expression Profiling; Macrophages; RNA-Seq; Single-Cell Analysis; Transcriptome; Adipose Tissue/cytology; Cell Cycle/drug effects; Cell Cycle/genetics; Cell Differentiation; Gene Expression Profiling/methods; Gene Expression Profiling/standards; Genome/drug effects; Genome/genetics; Lipopolysaccharides/immunology; Lipopolysaccharides/pharmacology; Macrophages/cytology; Macrophages/drug effects; Macrophages/immunology; Macrophages/metabolism; NF-KappaB Inhibitor alpha/genetics; Organ Specificity; Phenotype; RNA/genetics; RNA/isolation & purification; RNA-Seq/methods; RNA-Seq/standards; Reproducibility of Results; Sequence Analysis, RNA/methods; Sequence Analysis, RNA/standards; Single-Cell Analysis/methods; Stromal Cells/cytology; Stromal Cells/metabolism; Time Factors; Transcriptome/genetics
4.
Am J Hum Genet ; 111(7): 1282-1300, 2024 07 11.
Article in English | MEDLINE | ID: mdl-38834072

ABSTRACT

Transcriptomics is a powerful tool for unraveling the molecular effects of genetic variants and disease diagnosis. Prior studies have demonstrated that choice of genome build impacts variant interpretation and diagnostic yield for genomic analyses. To identify the extent to which genome build also impacts transcriptomics analyses, we studied the effect of the hg19, hg38, and CHM13 genome builds on expression quantification and outlier detection in 386 rare disease and familial control samples from both the Undiagnosed Diseases Network and Genomics Research to Elucidate the Genetics of Rare Disease Consortium. Across six routinely collected biospecimens, 61% of quantified genes were not influenced by genome build. However, we identified 1,492 genes with build-dependent quantification, 3,377 genes with build-exclusive expression, and 9,077 genes with annotation-specific expression across these biospecimens, including 566 clinically relevant and 512 known OMIM genes. Further, we demonstrate that between builds for a given gene, a larger difference in quantification is well correlated with a larger change in expression outlier calling. Combined, we provide a database of genes impacted by build choice and recommend that transcriptomics-guided analyses and diagnoses be cross-referenced with these data for robustness.


Subject(s)
Genome, Human; RNA-Seq; Humans; RNA-Seq/methods; Genomics/methods; Transcriptome; Rare Diseases/genetics; Rare Diseases/diagnosis; Gene Expression Profiling/methods
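
The abstract of record 4 relates build-dependent quantification to changes in expression outlier calling but does not say how outliers are called. The sketch below is only an illustration under loudly stated assumptions: it uses a plain per-gene z-score across samples, an arbitrary threshold of 2.0, and entirely hypothetical expression values. The function names are invented for this example and are not taken from the study.

```python
from statistics import mean, stdev

def zscores(values):
    """Z-score each value against the sample mean and sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]

def outlier_samples(values, threshold=2.0):
    """Return indices of samples whose |z| meets the (assumed) threshold."""
    return {i for i, z in enumerate(zscores(values)) if abs(z) >= threshold}

# Hypothetical log-scale expression of one gene in 8 samples,
# quantified against two genome builds (values are made up).
expr_hg19 = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8, 5.1, 1.0]
expr_hg38 = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8, 5.1, 4.9]

outliers_hg19 = outlier_samples(expr_hg19)
outliers_hg38 = outlier_samples(expr_hg38)

# Samples called as outliers under one build but not the other.
discordant = outliers_hg19 ^ outliers_hg38
print(sorted(discordant))
```

In this toy case the last sample is an outlier only under the first build, which is the kind of build-dependent call the abstract recommends cross-referencing.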
5.
Am J Hum Genet ; 111(5): 841-862, 2024 05 02.
Article in English | MEDLINE | ID: mdl-38593811

ABSTRACT

RNA sequencing (RNA-seq) has recently been used in translational research settings to facilitate diagnoses of Mendelian disorders. A significant obstacle for clinical laboratories in adopting RNA-seq is the low or absent expression of a significant number of disease-associated genes/transcripts in clinically accessible samples. As this is especially problematic in neurological diseases, we developed a clinical diagnostic approach that enhanced the detection and evaluation of tissue-specific genes/transcripts through fibroblast-to-neuron cell transdifferentiation. The approach is designed specifically to suit clinical implementation, emphasizing simplicity, cost effectiveness, turnaround time, and reproducibility. For clinical validation, we generated induced neurons (iNeurons) from 71 individuals with primary neurological phenotypes recruited to the Undiagnosed Diseases Network. The overall diagnostic yield was 25.4%. Over a quarter of the diagnostic findings benefited from transdifferentiation and could not be achieved by fibroblast RNA-seq alone. This iNeuron transcriptomic approach can be effectively integrated into diagnostic whole-transcriptome evaluation of individuals with genetic disorders.


Subject(s)
Cell Transdifferentiation; Fibroblasts; Neurons; Sequence Analysis, RNA; Humans; Cell Transdifferentiation/genetics; Fibroblasts/metabolism; Fibroblasts/cytology; Sequence Analysis, RNA/methods; Neurons/metabolism; Neurons/cytology; Transcriptome; Reproducibility of Results; Nervous System Diseases/genetics; Nervous System Diseases/diagnosis; RNA-Seq/methods; Female; Male
6.
Genome Res ; 34(5): 769-777, 2024 06 25.
Article in English | MEDLINE | ID: mdl-38866550

ABSTRACT

Gene prediction has remained an active area of bioinformatics research for a long time. Still, gene prediction in large eukaryotic genomes presents a challenge that must be addressed by new algorithms. The amount and significance of the evidence available from transcriptomes and proteomes vary across genomes, between genes, and even along a single gene. User-friendly and accurate annotation pipelines that can cope with such data heterogeneity are needed. The previously developed annotation pipelines BRAKER1 and BRAKER2 use RNA-seq or protein data, respectively, but not both. A further significant performance improvement integrating all three data types was made by the recently released GeneMark-ETP. We here present the BRAKER3 pipeline that builds on GeneMark-ETP and AUGUSTUS, and further improves accuracy using the TSEBRA combiner. BRAKER3 annotates protein-coding genes in eukaryotic genomes using both short-read RNA-seq and a large protein database, along with statistical models learned iteratively and specifically for the target genome. We benchmarked the new pipeline on genomes of 11 species under an assumed level of relatedness of the target species proteome to available proteomes. BRAKER3 outperforms BRAKER1 and BRAKER2. The transcript-level F1-score is increased by about 20 percentage points on average, with the difference most pronounced for species with large and complex genomes. BRAKER3 also outperforms other existing tools, MAKER2, Funannotate, and FINDER. The code of BRAKER3 is available on GitHub and as a ready-to-run Docker container for execution with Docker or Singularity. Overall, BRAKER3 is an accurate, easy-to-use tool for eukaryotic genome annotation.


Subject(s)
Molecular Sequence Annotation; Software; Molecular Sequence Annotation/methods; Humans; RNA-Seq/methods; Algorithms; Animals; Genome; Computational Biology/methods; Genomics/methods; Transcriptome
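
The transcript-level F1-score cited in the abstract of record 6 is the harmonic mean of sensitivity and precision. The following is a minimal sketch of that calculation, assuming each transcript is reduced to an identifier and that a predicted transcript counts as correct only if the same identifier appears in the reference set. The function name and the example identifiers are hypothetical and are not part of BRAKER3 or its evaluation scripts.

```python
def transcript_f1(predicted, reference):
    """Return (sensitivity, precision, F1) for two collections of transcript IDs."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    # Sensitivity: fraction of reference transcripts that were predicted.
    sensitivity = true_pos / len(reference) if reference else 0.0
    # Precision: fraction of predicted transcripts found in the reference.
    precision = true_pos / len(predicted) if predicted else 0.0
    if sensitivity + precision == 0:
        return sensitivity, precision, 0.0
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return sensitivity, precision, f1

# Hypothetical example: 4 reference transcripts, 3 predictions, 2 in common.
reference = {"t1", "t2", "t3", "t4"}
predicted = {"t1", "t2", "t5"}
sens, prec, f1 = transcript_f1(predicted, reference)
print(f"sensitivity={sens:.3f} precision={prec:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a gain of 20 percentage points requires sensitivity and precision to improve together; a large gain in only one of them moves F1 much less.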
7.
Genome Res ; 34(8): 1196-1210, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39147582

ABSTRACT

Single-cell sequencing methodologies such as scRNA-seq and scATAC-seq have become widespread and effective tools to interrogate tissue composition. Increasingly, variant callers are being applied to these methodologies to resolve the genetic heterogeneity of a sample, especially in the case of detecting the clonal architecture of a tumor. Typically, traditional bulk DNA variant callers are applied to the pooled reads of a single-cell library to detect candidate mutations. Recently, multiple studies have applied such callers on reads from individual cells, with some citing the ability to detect rare variants with higher sensitivity. Many studies apply these two approaches to the Chromium (10x Genomics) scRNA-seq and scATAC-seq methodologies. However, Chromium-based libraries may offer additional challenges to variant calling compared with existing single-cell methodologies, raising questions regarding the validity of variants obtained from such a workflow. To determine the merits and challenges of various variant-calling approaches on Chromium scRNA-seq and scATAC-seq libraries, we use sample libraries with matched bulk whole-genome sequencing to evaluate the performance of callers. We review caller performance, finding that bulk callers applied on pooled reads significantly outperform individual-cell approaches. We also evaluate variants unique to scRNA-seq and scATAC-seq methodologies, finding patterns of noise but also potential capture of RNA-editing events. Finally, we review the notion that variant calling at the single-cell level can detect rare somatic variants, providing empirical results that suggest resolving such variants is infeasible in single-cell Chromium libraries.


Subject(s)
Single-Cell Analysis; Single-Cell Analysis/methods; Humans; Chromium; Benchmarking; Gene Library; Sequence Analysis, RNA/methods; RNA-Seq/methods; High-Throughput Nucleotide Sequencing/methods; Genetic Variation; Single-Cell Gene Expression Analysis
8.
Genome Res ; 34(3): 484-497, 2024 04 25.
Article in English | MEDLINE | ID: mdl-38580401

ABSTRACT

Transcriptional regulation controls cellular functions through interactions between transcription factors (TFs) and their chromosomal targets. However, understanding the fate conversion potential of multiple TFs in an inducible manner remains limited. Here, we introduce iTF-seq as a method for identifying individual TFs that can alter cell fate toward specific lineages at a single-cell level. iTF-seq enables time course monitoring of transcriptome changes, and with biotinylated individual TFs, it provides a multi-omics approach to understanding the mechanisms behind TF-mediated cell fate changes. Our iTF-seq study in mouse embryonic stem cells identified multiple TFs that trigger rapid transcriptome changes indicative of differentiation within a day of induction. Moreover, cells expressing these potent TFs often show a slower cell cycle and increased cell death. Further analysis using bioChIP-seq revealed that GCM1 and OTX2 act as pioneer factors and activators by increasing gene accessibility and activating the expression of lineage specification genes during cell fate conversion. iTF-seq has utility in both mapping cell fate conversion and understanding cell fate conversion mechanisms.


Subject(s)
Cell Differentiation; Transcription Factors; Animals; Mice; Cell Differentiation/genetics; Cell Lineage/genetics; Gene Expression Profiling/methods; Mouse Embryonic Stem Cells/metabolism; Mouse Embryonic Stem Cells/cytology; Multiomics; RNA, Small Cytoplasmic/genetics; RNA, Small Cytoplasmic/metabolism; RNA-Seq/methods; Sequence Analysis, RNA/methods; Single-Cell Gene Expression Analysis; Transcription Factors/metabolism; Transcription Factors/genetics; Transcriptome
9.
Genome Res ; 34(7): 1036-1051, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39134412

ABSTRACT

Cell identity annotation for single-cell transcriptome data is a crucial process for constructing cell atlases, unraveling pathogenesis, and inspiring therapeutic approaches. Currently, the efficacy of existing methodologies is contingent upon specific data sets. Nevertheless, such data are often sourced from various batches, sequencing technologies, tissues, and even species. Notably, the gene regulatory relationship remains unaffected by the aforementioned factors, highlighting the extensive gene interactions within organisms. Therefore, we propose scHGR, an automated annotation tool designed to leverage gene regulatory relationships in constructing gene-mediated cell communication graphs for single-cell transcriptome data. This strategy helps reduce noise from diverse data sources while establishing distant cellular connections, yielding valuable biological insights. Experiments involving 22 scenarios demonstrate that scHGR precisely and consistently annotates cell identities, benchmarked against state-of-the-art methods. Crucially, scHGR uncovers novel subtypes within peripheral blood mononuclear cells, specifically from CD4+ T cells and cytotoxic T cells. Furthermore, by characterizing a cell atlas comprising 56 cell types for COVID-19 patients, scHGR identifies vital factors like IL1 and calcium ions, offering insights for targeted therapeutic interventions.


Subject(s)
COVID-19; Gene Regulatory Networks; RNA-Seq; Single-Cell Gene Expression Analysis; Humans; CD4-Positive T-Lymphocytes/metabolism; COVID-19/genetics; COVID-19/virology; Leukocytes, Mononuclear/metabolism; Molecular Sequence Annotation; RNA-Seq/methods; SARS-CoV-2/genetics; Transcriptome
10.
Nat Methods ; 21(8): 1462-1465, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38528186

ABSTRACT

Here we demonstrate that the large language model GPT-4 can accurately annotate cell types using marker gene information in single-cell RNA sequencing analysis. When evaluated across hundreds of tissue and cell types, GPT-4 generates cell type annotations exhibiting strong concordance with manual annotations. This capability can considerably reduce the effort and expertise required for cell type annotation. Additionally, we have developed an R software package GPTCelltype for GPT-4's automated cell type annotation.


Subject(s)
Single-Cell Gene Expression Analysis; Software; Animals; Humans; Mice; Molecular Sequence Annotation/methods; RNA-Seq/methods; Single-Cell Gene Expression Analysis/methods
11.
Nat Methods ; 21(8): 1492-1500, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38366243

ABSTRACT

Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN can detect functionally related genes coexpressed across species, redefining differential expression for cross-species analysis. Applying SATURN to three species whole-organism atlases and frog and zebrafish embryogenesis datasets, we show that SATURN can effectively transfer annotations across species, even when they are evolutionarily remote. We also demonstrate that SATURN can be used to find potentially divergent gene functions between glaucoma-associated genes in humans and four other species.


Subject(s)
RNA-Seq; Single-Cell Gene Expression Analysis; Zebrafish; Animals; Humans; Deep Learning; RNA-Seq/methods; Single-Cell Gene Expression Analysis/methods; Species Specificity; Zebrafish/genetics
12.
Nat Methods ; 21(7): 1349-1363, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38849569

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. Using different protocols and sequencing platforms, the consortium generated over 427 million long-read sequences from complementary DNA and direct RNA datasets, encompassing human, mouse and manatee species. Developers utilized these data to address challenges in transcript isoform detection, quantification and de novo transcript detection. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. Incorporating additional orthogonal data and replicate samples is advised when aiming to detect rare and novel transcripts or using reference-free approaches. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.


Subject(s)
Gene Expression Profiling; RNA-Seq; Humans; Animals; Mice; RNA-Seq/methods; Gene Expression Profiling/methods; Transcriptome; Sequence Analysis, RNA/methods; Molecular Sequence Annotation/methods
13.
PLoS Biol ; 22(4): e3002605, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38687805

ABSTRACT

Although sex chromosomes have evolved from autosomes, they often have unusual regulatory regimes that are sex- and cell-type-specific such as dosage compensation (DC) and meiotic sex chromosome inactivation (MSCI). The molecular mechanisms and evolutionary forces driving these unique transcriptional programs are critical for genome evolution but have been, in the case of MSCI in Drosophila, subject to continuous debate. Here, we take advantage of the younger sex chromosomes in D. miranda (XR and the neo-X) to infer how former autosomes acquire sex-chromosome-specific regulatory programs using single-cell and bulk RNA sequencing and ribosome profiling, in a comparative evolutionary context. We show that contrary to mammals and worms, the X down-regulation through germline progression is most consistent with the shutdown of DC instead of MSCI, resulting in half gene dosage at the end of meiosis for all 3 X's. Moreover, lowly expressed germline and meiotic genes on the neo-X are ancestrally lowly expressed, instead of acquired suppression after sex linkage. For the young neo-X, DC is incomplete across all tissue and cell types and this dosage imbalance is rescued by contributions from Y-linked gametologs which produce transcripts that are translated to compensate both gene and protein dosage. We find an excess of previously autosomal testis genes becoming Y-specific, showing that the neo-Y and its masculinization likely resolve sexual antagonism. Multicopy neo-sex genes are predominantly expressed during meiotic stages of spermatogenesis, consistent with their amplification being driven to interfere with mendelian segregation. Altogether, this study reveals germline regulation of evolving sex chromosomes and elucidates the consequences these unique regulatory mechanisms have on the evolution of sex chromosome architecture.


Subject(s)
Drosophila; Germ Cells; Meiosis; RNA-Seq; Sex Chromosomes; Single-Cell Analysis; Testis; Animals; Male; Testis/metabolism; Sex Chromosomes/genetics; Single-Cell Analysis/methods; Germ Cells/metabolism; Drosophila/genetics; Drosophila/metabolism; RNA-Seq/methods; Meiosis/genetics; Dosage Compensation, Genetic; Evolution, Molecular; Female; X Chromosome/genetics; Single-Cell Gene Expression Analysis
14.
Nat Rev Genet ; 22(6): 361-378, 2021 06.
Article in English | MEDLINE | ID: mdl-33597744

ABSTRACT

The human body is constantly exposed to microorganisms, which entails manifold interactions between human cells and diverse commensal or pathogenic bacteria. The cellular states of the interacting cells are decisive for the outcome of these encounters such as whether bacterial virulence programmes and host defence or tolerance mechanisms are induced. This Review summarizes how next-generation RNA sequencing (RNA-seq) has become a primary technology to study host-microbe interactions with high resolution, improving our understanding of the physiological consequences and the mechanisms at play. We illustrate how the discriminatory power and sensitivity of RNA-seq helps to dissect increasingly complex cellular interactions in time and space down to the single-cell level. We also outline how future transcriptomics may answer currently open questions in host-microbe interactions and inform treatment schemes for microbial disorders.


Subject(s)
Bacteria/genetics; Communicable Diseases/pathology; Host-Pathogen Interactions; RNA-Seq/methods; Transcriptome; Animals; Communicable Diseases/genetics; Communicable Diseases/microbiology; Humans
15.
Nature ; 597(7877): 561-565, 2021 09.
Article in English | MEDLINE | ID: mdl-34497418

ABSTRACT

Single-cell sequencing methods have enabled in-depth analysis of the diversity of cell types and cell states in a wide range of organisms. These tools focus predominantly on sequencing the genomes [1], epigenomes [2] and transcriptomes [3] of single cells. However, despite recent progress in detecting proteins by mass spectrometry with single-cell resolution [4], it remains a major challenge to measure translation in individual cells. Here, building on existing protocols [5-7], we have substantially increased the sensitivity of these assays to enable ribosome profiling in single cells. Integrated with a machine learning approach, this technology achieves single-codon resolution. We validate this method by demonstrating that limitation for a particular amino acid causes ribosome pausing at a subset of the codons encoding the amino acid. Of note, this pausing is only observed in a sub-population of cells correlating to its cell cycle state. We further expand on this phenomenon in non-limiting conditions and detect pronounced GAA pausing during mitosis. Finally, we demonstrate the applicability of this technique to rare primary enteroendocrine cells. This technology provides a first step towards determining the contribution of the translational process to the remarkable diversity between seemingly identical cells.


Subject(s)
Cell Cycle/genetics; Codon/genetics; Protein Biosynthesis; RNA-Seq/methods; Ribosomes/metabolism; Single-Cell Analysis; Amino Acids/deficiency; Amino Acids/pharmacology; Animals; Cell Cycle/drug effects; Cell Line; Female; Humans; Machine Learning; Male; Mice; Peptide Chain Elongation, Translational; Peptide Chain Initiation, Translational; Peptide Chain Termination, Translational; Protein Biosynthesis/drug effects; Reproducibility of Results; Ribosomes/drug effects
16.
Proc Natl Acad Sci U S A ; 121(37): e2400002121, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39226348

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) data, susceptible to noise arising from biological variability and technical errors, can distort gene expression analysis and impact cell similarity assessments, particularly in heterogeneous populations. Current methods, including deep learning approaches, often struggle to accurately characterize cell relationships due to this inherent noise. To address these challenges, we introduce scAMF (Single-cell Analysis via Manifold Fitting), a framework designed to enhance clustering accuracy and data visualization in scRNA-seq studies. At the heart of scAMF lies the manifold fitting module, which effectively denoises scRNA-seq data by unfolding their distribution in the ambient space. This unfolding aligns the gene expression vector of each cell more closely with its underlying structure, bringing it spatially closer to other cells of the same cell type. To comprehensively assess the impact of scAMF, we compile a collection of 25 publicly available scRNA-seq datasets spanning various sequencing platforms, species, and organ types, forming an extensive RNA data bank. In our comparative studies, benchmarking scAMF against existing scRNA-seq analysis algorithms in this data bank, we consistently observe that scAMF outperforms in terms of clustering efficiency and data visualization clarity. Further experimental analysis reveals that this enhanced performance stems from scAMF's ability to improve the spatial distribution of the data and capture class-consistent neighborhoods. These findings underscore the promising application potential of manifold fitting as a tool in scRNA-seq analysis, signaling a significant enhancement in the precision and reliability of data interpretation in this critical field of study.


Subject(s)
Single-Cell Analysis; Single-Cell Analysis/methods; Cluster Analysis; Humans; Sequence Analysis, RNA/methods; Animals; Algorithms; RNA/genetics; Gene Expression Profiling/methods; RNA-Seq/methods
17.
PLoS Genet ; 20(5): e1011301, 2024 May.
Article in English | MEDLINE | ID: mdl-38814983

ABSTRACT

Whether single-cell RNA-sequencing (scRNA-seq) captures the same biological information as single-nucleus RNA-sequencing (snRNA-seq) remains uncertain and likely to be context-dependent. Herein, a head-to-head comparison was performed in matched normal-adenocarcinoma human lung samples to assess biological insights derived from scRNA-seq versus snRNA-seq and better understand the cellular transition that occurs from normal to tumoral tissue. Here, the transcriptome of 160,621 cells/nuclei was obtained. In non-tumor lung, cell type proportions varied widely between scRNA-seq and snRNA-seq with a predominance of immune cells in the former (81.5%) and epithelial cells (69.9%) in the latter. Similar results were observed in adenocarcinomas, in addition to an overall increase in cell type heterogeneity and a greater prevalence of copy number variants in cells of epithelial origin, which suggests malignant assignment. The cell type transition that occurs from normal lung tissue to adenocarcinoma was not always concordant whether cells or nuclei were examined. As expected, large differential expression of the whole-cell and nuclear transcriptome was observed, but cell-type specific changes of paired normal and tumor lung samples revealed a set of common genes in the cells and nuclei involved in cancer-related pathways. In addition, we showed that the ligand-receptor interactome landscape of lung adenocarcinoma was largely different whether cells or nuclei were evaluated. Immune cell depletion in fresh specimens partly mitigated the difference in cell type composition observed between cells and nuclei. However, the extra manipulations affected cell viability and amplified the transcriptional signatures associated with stress responses. In conclusion, research applications focussing on mapping the immune landscape of lung adenocarcinoma benefit from scRNA-seq in fresh samples, whereas snRNA-seq of frozen samples provides a low-cost alternative to profile more epithelial and cancer cells, and yields cell type proportions that more closely match tissue content.


Subject(s)
Adenocarcinoma of Lung; Lung Neoplasms; Sequence Analysis, RNA; Single-Cell Analysis; Humans; Single-Cell Analysis/methods; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Adenocarcinoma of Lung/genetics; Adenocarcinoma of Lung/pathology; Adenocarcinoma of Lung/immunology; Sequence Analysis, RNA/methods; Cell Nucleus/genetics; Transcriptome/genetics; Gene Expression Regulation, Neoplastic; Lung/metabolism; Lung/pathology; Adenocarcinoma/genetics; Adenocarcinoma/pathology; RNA, Small Nuclear/genetics; RNA-Seq/methods; Gene Expression Profiling/methods; DNA Copy Number Variations/genetics
18.
EMBO J ; 41(2): e109221, 2022 01 17.
Article in English | MEDLINE | ID: mdl-34918370

ABSTRACT

Within a tumor, cancer cells exist in different states that are associated with distinct tumor functions, including proliferation, differentiation, invasion, metastasis, and resistance to anti-cancer therapy. The identification of the gene regulatory networks underpinning each state is essential for better understanding functional tumor heterogeneity and revealing tumor vulnerabilities. Here, we review the different studies identifying tumor states by single-cell sequencing approaches and the mechanisms that promote and sustain these functional states and regulate their transitions. We also describe how different tumor states are spatially distributed and interact with the specific stromal cells that compose the tumor microenvironment. Finally, we discuss how the understanding of tumor plasticity and transition states can be used to develop new strategies to improve cancer therapy.


Subject(s)
Neoplasms/metabolism , Single-Cell Analysis/methods , Animals , Humans , Neoplasms/genetics , Neoplasms/pathology , RNA-Seq/methods
19.
RNA ; 30(11): 1495-1512, 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39174298

ABSTRACT

End-to-end RNA-sequencing methods that capture 5'-sequence content without cumbersome library manipulations are of great interest, particularly for analysis of long RNAs. While template-switching methods have been developed for RNA sequencing by distributive short-read RTs, such as the MMLV RTs used in SMART-Seq methods, they have not been adapted to leverage the power of ultraprocessive RTs, such as those derived from group II introns. To facilitate this transition, we dissected the individual processes that guide the enzymatic specificity and efficiency of the multistep template-switching reaction carried out by RTs, in this case, by MarathonRT. Remarkably, this is the first study of its kind for any RT. First, we characterized the nucleotide specificity of the nontemplated addition (NTA) reaction that occurs when the RT extends past the RNA 5'-terminus. We then evaluated the binding specificity of specialized template-switching oligonucleotides, optimizing their sequences and chemical properties to guide an efficient template-switching reaction. Having dissected and optimized these individual steps, we then unified them into a procedure for performing RNA sequencing with MarathonRT enzymes, using a well-characterized RNA reference set. The resulting reads span a six-log range in transcript concentration and accurately represent the input RNA identities in both length and composition. We also performed RNA-seq from total human RNA and poly(A)-enriched RNA, with short- and long-read sequencing demonstrating that MarathonRT enhances the discovery of RNA molecules unseen by conventional RTs. Altogether, we have generated a new pipeline for rapid, accurate sequencing of complex RNA libraries containing mixtures of long RNA transcripts.


Subject(s)
RNA-Seq , RNA-Seq/methods , Humans , RNA/genetics , RNA/chemistry , RNA/metabolism , Sequence Analysis, RNA/methods , Introns/genetics
20.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39060167

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables the exploration of biological heterogeneity among different cell types within tissues at single-cell resolution. Inferring cell types within tissues is foundational for downstream research. Most existing methods for cell type inference based on scRNA-seq data primarily use highly variable genes (HVGs) with higher expression levels as clustering features, overlooking the contribution of HVGs with lower expression levels. To address this, we have designed a novel cell type inference method for scRNA-seq data, termed scLEGA. scLEGA employs a novel zero-inflated negative binomial (ZINB) loss function that fully considers the contribution of genes with lower expression levels, and it combines two distinct scRNA-seq clustering strategies through a multi-head attention mechanism. It uses a low-expression optimized denoising autoencoder, based on the novel ZINB model, to extract low-dimensional features and handle dropout events, and a GCN-based graph autoencoder (GAE) that leverages neighbor information to guide dimensionality reduction. The iterative fusion of denoising and topological embedding in scLEGA facilitates the acquisition of cluster-friendly cell representations in the hidden embedding, where similar cells are brought closer together. Compared to 12 state-of-the-art cell type inference methods on 15 scRNA-seq datasets, scLEGA demonstrates superior performance in clustering accuracy, scalability, and stability. The scLEGA model code is freely available at https://github.com/Masonze/scLEGA-main.


Subject(s)
RNA-Seq , Single-Cell Gene Expression Analysis , Humans , Algorithms , Cluster Analysis , Computational Biology/methods , RNA-Seq/methods , Software