ABSTRACT
Immunotherapy is a promising treatment for triple-negative breast cancer (TNBC), but patients relapse, highlighting the need to understand the mechanisms of resistance. We discovered that in primary breast cancer, tumor cells that resist T cell attack are quiescent. Quiescent cancer cells (QCCs) form clusters with reduced immune infiltration. They also display superior tumorigenic capacity and higher expression of chemotherapy resistance and stemness genes. We adapted single-cell RNA-sequencing with precise spatial resolution to profile infiltrating cells inside and outside the QCC niche. This transcriptomic analysis revealed hypoxia-induced programs and identified more exhausted T cells, tumor-protective fibroblasts, and dysfunctional dendritic cells inside clusters of QCCs. This uncovered differential phenotypes in infiltrating cells based on their intra-tumor location. Thus, QCCs constitute immunotherapy-resistant reservoirs by orchestrating a local hypoxic immune-suppressive milieu that blocks T cell function. Eliminating QCCs holds the promise to counteract immunotherapy resistance and prevent disease recurrence in TNBC.
Subjects
Triple Negative Breast Neoplasms , Humans , Immunosuppressive Agents/therapeutic use , Immunotherapy , Local Neoplasm Recurrence , T-Lymphocytes/pathology , Triple Negative Breast Neoplasms/pathology , Tumor Microenvironment
ABSTRACT
Tissues are exposed to diverse inflammatory challenges that shape future inflammatory responses. While cellular metabolism regulates immune function, how metabolism programs and stabilizes immune states within tissues and tunes susceptibility to inflammation is poorly understood. Here, we describe an innate immune metabolic switch that programs long-term intestinal tolerance. Intestinal interleukin-18 (IL-18) stimulation elicited tolerogenic macrophages by preventing their proinflammatory glycolytic polarization via metabolic reprogramming to fatty acid oxidation (FAO). FAO reprogramming was triggered by IL-18 activation of SLC12A3 (NCC), leading to sodium influx, release of mitochondrial DNA, and activation of stimulator of interferon genes (STING). FAO was maintained in macrophages by a bistable switch that encoded memory of IL-18 stimulation and by intercellular positive feedback that sustained the production of macrophage-derived 2'3'-cyclic GMP-AMP (cGAMP) and epithelial-derived IL-18. Thus, a tissue-reinforced metabolic switch encodes durable immune tolerance in the gut and may enable reconstructing compromised immune tolerance in chronic inflammation.
Subjects
Immune Tolerance , Interleukin-18 , Macrophages , Cyclic Nucleotides , Interleukin-18/metabolism , Interleukin-18/immunology , Animals , Mice , Cyclic Nucleotides/metabolism , Macrophages/immunology , Macrophages/metabolism , Humans , Inbred C57BL Mice , Intestinal Mucosa/immunology , Intestinal Mucosa/metabolism , Knockout Mice , Fatty Acids/metabolism , Intestines/immunology , Innate Immunity , Inflammation/immunology , Inflammation/metabolism , Glycolysis , Oxidation-Reduction
ABSTRACT
During typesetting of this article, errors were inadvertently introduced to the hyperlinked URLs of some of the clustering tools in table 1 (Seurat, CIDR, pcaReduce and mpath), as well as to the numbering of the bold-text annotations in the reference list. The article has now been corrected online. The editors apologize for this error.
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) allows researchers to collect large catalogues detailing the transcriptomes of individual cells. Unsupervised clustering is of central importance for the analysis of these data, as it is used to identify putative cell types. However, there are many challenges involved. We discuss why clustering is a challenging problem from a computational point of view and what aspects of the data make it challenging. We also consider the difficulties related to the biological interpretation and annotation of the identified clusters.
Subjects
Cell Lineage/genetics , Computational Biology/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Messenger RNA/genetics , Single-Cell Analysis/statistics & numerical data , Transcriptome , Cluster Analysis , Genetic Epigenesis , Eukaryotic Cells/classification , Eukaryotic Cells/cytology , Eukaryotic Cells/metabolism , Gene Expression Profiling , Humans , Messenger RNA/chemistry , Messenger RNA/metabolism , Single-Cell Analysis/methods , Unsupervised Machine Learning
ABSTRACT
A major challenge in single-cell biology is identifying cell-type-specific gene functions, which may substantially improve precision medicine. Differential expression analysis of genes is a popular, yet insufficient approach, and complementary methods that associate function with cell type are required. Here, we describe scHumanNet (https://github.com/netbiolab/scHumanNet), a single-cell network analysis platform for resolving cellular heterogeneity across gene functions in humans. Based on cell-type-specific gene networks (CGNs) constructed under the guidance of the HumanNet reference interactome, scHumanNet displayed higher functional relevance to the cellular context than CGNs built by other methods on single-cell transcriptome data. Cellular deconvolution of gene signatures based on network compactness across cell types revealed breast cancer prognostic markers associated with T cells. scHumanNet could also prioritize genes associated with particular cell types using CGN centrality and identified the differential hubness of CGNs between disease and healthy conditions. We demonstrated the usefulness of scHumanNet by uncovering T-cell-specific functional effects of GITR, a prognostic gene for breast cancer, and functional defects in autism spectrum disorder genes specific for inhibitory neurons. These results suggest that scHumanNet will advance our understanding of cell-type specificity across human disease genes.
Subjects
Single-Cell Analysis , Female , Humans , Autism Spectrum Disorder/genetics , Breast Neoplasms/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , T-Lymphocytes , Transcriptome , Software
ABSTRACT
BACKGROUND: Regulation of transcription is central to the emergence of new cell types during development, and it often involves activation of genes via proximal and distal regulatory regions. The activity of regulatory elements is determined by transcription factors (TFs) and epigenetic marks, but despite extensive mapping of such patterns, the extraction of regulatory principles remains challenging. RESULTS: Here we study differentially and similarly expressed genes along with their associated epigenomic profiles, chromatin accessibility and DNA methylation, during lineage specification at gastrulation in mice. Comparison of the three lineages allows us to identify genomic and epigenomic features that distinguish the two classes of genes. We show that differentially expressed genes are primarily regulated by distal elements, while similarly expressed genes are controlled by proximal housekeeping regulatory programs. Differentially expressed genes are relatively isolated within topologically associated domains, while similarly expressed genes tend to be located in gene clusters. Transcription of differentially expressed genes is associated with differentially open chromatin at distal elements including enhancers, while that of similarly expressed genes is associated with ubiquitously accessible chromatin at promoters. CONCLUSION: Based on these associations of (linearly) distal genes' transcription start sites (TSSs) and putative enhancers for developmental genes, our findings allow us to link putative enhancers to their target promoters and to infer lineage-specific repertoires of putative driver transcription factors, within which we define subgroups of pioneers and co-operators.
Subjects
Epigenomics , Essential Genes , Animals , Mice , Chromatin/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Gene Expression Profiling
ABSTRACT
Single-cell technologies have made it possible to profile millions of cells, but for these resources to be useful they must be easy to query and access. To facilitate interactive and intuitive access to single-cell data we have developed scfind, a single-cell analysis tool that facilitates fast search of biologically or clinically relevant marker genes in cell atlases. Using transcriptome data from six mouse cell atlases, we show how scfind can be used to evaluate marker genes, perform in silico gating, and identify both cell-type-specific and housekeeping genes. Moreover, we have developed a subquery optimization routine to ensure that long and complex queries return meaningful results. To make scfind more user friendly, we use indices of PubMed abstracts and techniques from natural language processing to allow for arbitrary queries. Finally, we show how scfind can be used for multi-omics analyses by combining single-cell ATAC-seq data with transcriptome data.
Subjects
Data Management/methods , Information Storage and Retrieval/methods , Single-Cell Analysis/methods , Transcriptome/genetics , Algorithms , Animals , Data Analysis , Genetic Databases , Gene Expression Regulation , Mice , Natural Language Processing , PubMed , User-Computer Interface
ABSTRACT
Most genomes harbor a large number of transposons, which play an important role in evolution and gene regulation. They are also of interest to clinicians as they are involved in several diseases, including cancer and neurodegeneration. Although several methods for transposon identification are available, they are often highly specialised towards specific tasks or classes of transposons, and they lack common standards such as a unified taxonomy scheme and output file format. We present TransposonUltimate, a powerful bundle of three modules for transposon classification, annotation, and detection of transposition events. TransposonUltimate comes as a Conda package under the GPL-3.0 licence, is well documented, and is easy to install through https://github.com/DerKevinRiehl/TransposonUltimate. We benchmark the classification module on the large TransposonDB covering 891,051 sequences to demonstrate that it outperforms the currently best existing solutions. The annotation and detection modules combine sixteen existing software tools, and we illustrate their use by annotating the Caenorhabditis elegans, Rhizophagus irregularis and Oryza sativa subsp. japonica genomes. Finally, we use the detection module to discover 29,554 transposition events in the genomes of 20 wild-type strains of C. elegans. Databases, assemblies, annotations and further findings can be downloaded from https://doi.org/10.5281/zenodo.5518085.
Subjects
DNA Transposable Elements , Software , Animals , Benchmarking , Caenorhabditis elegans/genetics , Fungi/genetics , Genome , Molecular Sequence Annotation , Oryza/genetics , Reference Standards
ABSTRACT
Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.
Subjects
RNA-Seq/methods , RNA/genetics , Single-Cell Analysis/methods , Algorithms , Base Sequence , Cell Line , Cluster Analysis , Genotype , Humans , Single Nucleotide Polymorphism , Sensitivity and Specificity , Software
ABSTRACT
As the cost of single-cell RNA-seq experiments has decreased, an increasing number of datasets are now available. Combining newly generated and publicly accessible datasets is challenging due to non-biological signals, commonly known as batch effects. Although there are several computational methods available that can remove batch effects, evaluating which method performs best is not straightforward. Here, we present BatchBench (https://github.com/cellgeni/batchbench), a modular and flexible pipeline for comparing batch correction methods for single-cell RNA-seq data. We apply BatchBench to eight methods, highlight their methodological differences, and assess their performance and computational requirements through a compendium of well-studied datasets. This systematic comparison guides users in the choice of batch correction tool, and the pipeline makes it easy to evaluate other datasets.
Subjects
RNA-Seq/methods , Single-Cell Analysis/methods , Software , Animals , Datasets as Topic , Humans , Mice
ABSTRACT
DNA strand asymmetries can have a major effect on several biological functions, including replication, transcription and transcription factor binding. As such, DNA strand asymmetries and mutational strand bias can provide information about biological function. However, a versatile tool to explore this does not exist. Here, we present Asymmetron, a user-friendly computational tool that performs statistical analysis and visualizations for the evaluation of strand asymmetries. Asymmetron takes as input DNA features provided with strand annotation and outputs strand asymmetries for consecutive occurrences of a single DNA feature or between pairs of features. We illustrate the use of Asymmetron by identifying transcriptional and replicative strand asymmetries of germline structural variant breakpoints. We also show that the orientation of the binding sites of 45% of the human transcription factors analyzed has a significant DNA strand bias in transcribed regions, which is also corroborated in ChIP-seq analyses and is likely associated with transcription. In summary, we provide a novel tool to assess DNA strand asymmetries and show how it can be used to derive new insights across a variety of biological disciplines.
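The idea of measuring strand asymmetry for consecutive occurrences of a single feature can be illustrated with a minimal sketch. It is not Asymmetron's code or interface: the function names, the (chromosome, position, strand) tuple layout, and the choice of an exact two-sided binomial test against a 50/50 expectation are all assumptions made for illustration.

```python
from math import comb


def consecutive_orientations(features):
    """Count same-strand and opposite-strand pairs of neighbouring features.

    `features` is a list of (chromosome, position, strand) tuples; this layout
    is a hypothetical one chosen for the sketch.
    """
    ordered = sorted(features, key=lambda f: (f[0], f[1]))
    same = opposite = 0
    for (chrom_a, _, strand_a), (chrom_b, _, strand_b) in zip(ordered, ordered[1:]):
        if chrom_a != chrom_b:
            continue  # neighbours on different chromosomes are not compared
        if strand_a == strand_b:
            same += 1
        else:
            opposite += 1
    return same, opposite


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value (assumed test, not taken from the source)."""
    if trials == 0:
        return 1.0
    probs = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = probs[successes]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    sites = [("chr1", 100, "+"), ("chr1", 200, "+"), ("chr1", 300, "-"),
             ("chr2", 50, "-"), ("chr2", 80, "-")]
    same, opposite = consecutive_orientations(sites)
    print(same, opposite, binomial_two_sided_p(same, same + opposite))
```

A strong excess of same-strand (or opposite-strand) neighbours relative to the 50/50 expectation would then be read as a strand asymmetry; the tool itself may use different statistics.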
Subjects
Computational Biology/methods , DNA Replication/genetics , DNA/genetics , Mutation , Genetic Transcription/genetics , A549 Cells , Algorithms , Transformed Cell Line , DNA/chemistry , DNA/metabolism , Hep G2 Cells , Humans , K562 Cells , MCF-7 Cells , Genetic Models , Protein Binding , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Today it is possible to profile the transcriptome of individual cells, and a key step in the analysis of these datasets is unsupervised clustering. For very large datasets, efficient algorithms are required to ensure that analyses can be conducted with reasonable time and memory requirements. RESULTS: Here, we present a highly efficient k-means based approach, and we demonstrate that it scales favorably with the number of cells with regard to time and memory. CONCLUSIONS: We have demonstrated that our streaming k-means clustering algorithm gives state-of-the-art performance while resource requirements scale favorably for up to 2 million cells.
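To make the memory argument for streaming k-means concrete, here is a minimal sketch of a generic online (sequential) k-means update. It is not the algorithm from the paper: seeding the centers with the first k points and the running-mean update are textbook simplifications, and all names are hypothetical. The point is only that cells are visited once and memory stays proportional to k times the number of features.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, point):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i], point))


def streaming_kmeans(stream, k):
    """Single-pass k-means over an iterable of points.

    Only the k centers and their counts are kept in memory; each point is
    seen once and then discarded.
    """
    centers, counts = [], []
    for point in stream:
        point = list(point)
        if len(centers) < k:
            # Assumption for the sketch: the first k points seed the centers.
            centers.append(point)
            counts.append(1)
            continue
        i = nearest(centers, point)
        counts[i] += 1
        step = 1.0 / counts[i]
        # Running-mean update: move the center towards the new point.
        centers[i] = [c + step * (x - c) for c, x in zip(centers[i], point)]
    return centers, counts


if __name__ == "__main__":
    cells = iter([(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)])
    print(streaming_kmeans(cells, k=2))
```

Because the input is consumed as an iterator, the same function works on a generator that reads expression profiles from disk in chunks.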
Subjects
Algorithms , Transcriptome , Cluster Analysis
ABSTRACT
Somatic mutations show variation in density across cancer genomes. Previous studies have shown that chromatin organization and replication time domains are correlated with, and thus predictive of, this variation. Here, we analyze 1809 whole-genome sequences from 10 cancer types to show that a subset of repetitive DNA sequences called non-B motifs, which predict noncanonical secondary structure formation, can independently account for variation in mutation density. Combined with epigenetic factors and replication timing, the variance explained can be improved to 43%-76%. Approximately twofold mutation enrichment is observed directly within non-B motifs, is focused on exposed structural components, and is dependent on physical properties that are optimal for secondary structure formation. Therefore, there is mounting evidence that secondary structures arising from non-B motifs are not simply associated with increased mutation density; they are possibly causally implicated. Our results suggest that they are determinants of mutagenesis and increase the likelihood of recurrent mutations in the genome. This analysis calls for caution in the interpretation of recurrent mutations and highlights the importance of taking non-B motifs, which can simply be inferred from the reference sequence, into consideration in background models of mutability henceforth.
Subjects
Mutagenesis , Neoplasms/genetics , Nucleotide Motifs , B-Form DNA/chemistry , B-Form DNA/genetics , Humans
ABSTRACT
Single-cell RNA-seq (scRNA-seq) allows researchers to define cell types on the basis of unsupervised clustering of the transcriptome. However, differences in experimental methods and computational analyses make it challenging to compare data across experiments. Here we present scmap (http://bioconductor.org/packages/scmap; web version at http://www.sanger.ac.uk/science/tools/scmap), a method for projecting cells from an scRNA-seq data set onto cell types or individual cells from other experiments.
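Projecting a cell onto cell types from another experiment can be sketched as a nearest-centroid lookup. The example below is an illustration under stated assumptions, not scmap's implementation or API: the use of cosine similarity alone, the 0.7 cut-off, the "unassigned" label, and every function name are choices made for this sketch.

```python
import math


def centroid(cells):
    """Per-gene mean expression across the cells of one reference cell type."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]


def cosine(a, b):
    """Cosine similarity between two expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def project(query, reference, threshold=0.7):
    """Assign `query` to the most similar reference cell type.

    `reference` maps a cell-type name to a list of expression vectors.
    Cells whose best similarity is below `threshold` (an illustrative value)
    are left unassigned instead of being forced into a type.
    """
    centroids = {name: centroid(cells) for name, cells in reference.items()}
    best_name, best_score = max(
        ((name, cosine(query, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    if best_score < threshold:
        return "unassigned", best_score
    return best_name, best_score


if __name__ == "__main__":
    reference = {
        "T cell": [[10, 0, 1], [8, 1, 0]],
        "B cell": [[0, 9, 1], [1, 11, 0]],
    }
    print(project([9, 0, 0], reference))
    print(project([0, 0, 5], reference))
```

Allowing an "unassigned" outcome matters when the query data set contains cell types that are absent from the reference.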
Subjects
Gene Expression Profiling/methods , Gene Expression Regulation/physiology , Single-Cell Analysis , Software , Transcriptome
ABSTRACT
Disruption of the MECP2 gene leads to Rett syndrome (RTT), a severe neurological disorder with features of autism. MECP2 encodes a methyl-DNA-binding protein that has been proposed to function as a transcriptional repressor, but despite numerous mouse studies examining neuronal gene expression in Mecp2 mutants, no clear model has emerged for how MeCP2 protein regulates transcription. Here we identify a genome-wide length-dependent increase in gene expression in MeCP2 mutant mouse models and human RTT brains. We present evidence that MeCP2 represses gene expression by binding to methylated CA sites within long genes, and that in neurons lacking MeCP2, decreasing the expression of long genes attenuates RTT-associated cellular deficits. In addition, we find that long genes as a population are enriched for neuronal functions and selectively expressed in the brain. These findings suggest that mutations in MeCP2 may cause neurological dysfunction by specifically disrupting long gene expression in the brain.
Subjects
DNA Methylation/genetics , Methyl-CpG-Binding Protein 2/genetics , Methyl-CpG-Binding Protein 2/metabolism , Mutation/genetics , Rett Syndrome/genetics , Animals , Base Sequence , Brain/metabolism , DNA (Cytosine-5-)-Methyltransferases/metabolism , DNA Methyltransferase 3A , Animal Disease Models , Female , Gene Expression Regulation , Humans , Male , Methyl-CpG-Binding Protein 2/deficiency , Mice , Molecular Sequence Data , Neurons/metabolism
ABSTRACT
Single-cell RNA-seq enables the quantitative characterization of cell types based on global transcriptome profiles. We present single-cell consensus clustering (SC3), a user-friendly tool for unsupervised clustering, which achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach (http://bioconductor.org/packages/SC3). We demonstrate that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients.
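Combining multiple clustering solutions through a consensus can be sketched with a co-association matrix: for every pair of cells, record the fraction of clusterings that place them together. The code below is an illustrative simplification, not SC3's implementation. In particular, deriving final clusters by linking pairs whose consensus exceeds 0.5 is an assumption chosen to keep the example short, and all function names are hypothetical.

```python
def consensus_matrix(labelings):
    """Fraction of clusterings in which each pair of cells shares a cluster.

    `labelings` is a list of label lists, one per clustering solution, all
    over the same cells in the same order.
    """
    n = len(labelings[0])
    together = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1.0
    runs = len(labelings)
    return [[value / runs for value in row] for row in together]


def consensus_clusters(matrix, threshold=0.5):
    """Group cells connected by consensus values above `threshold`.

    This connected-components step is a stand-in for whatever final
    clustering a real tool applies to the consensus matrix.
    """
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


if __name__ == "__main__":
    solutions = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
    matrix = consensus_matrix(solutions)
    print(consensus_clusters(matrix))
```

Note that the consensus is insensitive to how individual solutions name their clusters: the first two solutions above use swapped labels but agree on every pair.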
Subjects
Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Cluster Analysis , Datasets as Topic , Hematopoietic Stem Cells/cytology , Humans , Support Vector Machine
ABSTRACT
MOTIVATION: Most genomes contain thousands of genes, but for most functional responses, only a subset of those genes are relevant. To facilitate many single-cell RNASeq (scRNASeq) analyses, the set of genes is often reduced through feature selection, i.e. by removing genes only subject to technical noise. RESULTS: We present M3Drop, an R package that implements popular existing feature selection methods and two novel methods which take advantage of the prevalence of zeros (dropouts) in scRNASeq data to identify features. We show these new methods outperform existing methods on simulated and real datasets. AVAILABILITY AND IMPLEMENTATION: M3Drop is freely available on GitHub as an R package and is compatible with other popular scRNASeq tools: https://github.com/tallulandrews/M3Drop. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
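How dropouts can drive feature selection can be sketched as follows. This is a Python illustration, not the M3Drop R package or its API. It assumes, beyond what the abstract states, a Michaelis-Menten-style relationship between a gene's mean expression S and its dropout rate P, namely P = K / (K + S), so that each gene implies its own K = P * S / (1 - P). Genes with an unusually large K have more zeros than their mean expression would suggest and are ranked first. All names and the toy counts are hypothetical.

```python
from statistics import median


def gene_stats(counts):
    """Mean expression and dropout rate (fraction of zeros) for one gene."""
    n = len(counts)
    mean = sum(counts) / n
    dropout = sum(1 for c in counts if c == 0) / n
    return mean, dropout


def gene_k(mean, dropout):
    """Per-gene K implied by the assumed model P = K / (K + S)."""
    if mean == 0 or dropout >= 1.0:
        return None  # gene never detected: K is undefined
    return dropout * mean / (1.0 - dropout)


def rank_by_excess_dropout(matrix):
    """Rank genes from most to least excess dropout.

    `matrix` maps gene name -> list of counts across cells. Returns the
    ranked gene names, the per-gene K values, and the median K as a crude
    global reference (an assumption of this sketch).
    """
    ks = {}
    for gene, counts in matrix.items():
        k = gene_k(*gene_stats(counts))
        if k is not None:
            ks[gene] = k
    ranked = sorted(ks, key=lambda g: ks[g], reverse=True)
    return ranked, ks, median(ks.values())


if __name__ == "__main__":
    toy = {
        "housekeeping": [5, 5, 5, 5],
        "marker": [0, 0, 20, 20],
        "low": [0, 1, 1, 2],
    }
    print(rank_by_excess_dropout(toy))
```

In the toy data the "marker" gene is highly expressed in half of the cells and absent in the rest, which is the pattern such a dropout-based criterion is meant to pick up.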
Subjects
Software , Genome , RNA Sequence Analysis , Single-Cell Analysis
ABSTRACT
Promoters initiate RNA synthesis, and enhancers stimulate promoter activity. Whether promoter and enhancer activities are encoded distinctly in DNA sequences is unknown. We measured the enhancer and promoter activities of thousands of DNA fragments transduced into mouse neurons. We focused on genomic loci bound by the neuronal activity-regulated coactivator CREBBP, and we measured enhancer and promoter activities both before and after neuronal activation. We find that the same sequences typically encode both enhancer and promoter activities. However, gene promoters generate more promoter activity than distal enhancers, despite generating similar enhancer activity. Surprisingly, the greater promoter activity of gene promoters is not due to conventional core promoter elements or splicing signals. Instead, we find that particular transcription factor binding motifs are intrinsically biased toward the generation of promoter activity, whereas others are not. Although the specific biases we observe may be dependent on experimental or cellular context, our results suggest that gene promoters are distinguished from distal enhancers by specific complements of transcriptional activators.
Subjects
CREB-Binding Protein/genetics , Genetic Enhancer Elements , Genetic Promoter Regions , Genetic Transcription , Animals , Binding Sites , Chromatin/genetics , DNA-Binding Proteins/genetics , Mice , Neurons/metabolism , Protein Binding , DNA Sequence Analysis
ABSTRACT
MOTIVATION: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. RESULTS: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. AVAILABILITY AND IMPLEMENTATION: The MPRAnator tool set is implemented in Python, Perl and JavaScript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. CONTACT: igs@sanger.ac.uk or mh26@sanger.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
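The four negative-control operations named for the Transmutation tool (scrambling, reversing, complementing, and random mutation) are simple string transformations. The sketch below shows one way to write them; it is not MPRAnator's source code or its REST API, and the function names, the restriction to the A/C/G/T alphabet, and the seeded random generator are assumptions made for illustration.

```python
import random

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Return the base-wise complement without reversing."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Return the sequence of the opposite strand."""
    return complement(seq)[::-1]


def scramble(seq, rng):
    """Shuffle the bases, preserving composition but destroying the motif."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def mutate(seq, n_mutations, rng):
    """Substitute `n_mutations` distinct positions with a different base."""
    n_mutations = min(n_mutations, len(seq))
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        alternatives = [b for b in "ACGT" if b != bases[pos].upper()]
        bases[pos] = rng.choice(alternatives)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a design can be regenerated
    motif = "TGACGTCA"
    for name, control in [
        ("reversed", reverse(motif)),
        ("complemented", complement(motif)),
        ("scrambled", scramble(motif, rng)),
        ("mutated", mutate(motif, 2, rng)),
    ]:
        print(name, control)
```

One caveat worth keeping in mind when designing controls this way: for palindromic motifs the reverse complement equals the original sequence, and a scramble can by chance recreate a functional site, so controls are usually checked against the motif afterwards.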
Subjects
Gene Regulatory Networks , Reporter Genes , High-Throughput Screening Assays/methods , Software , Transcription Factors/metabolism , DNA/metabolism , Internet , Single Nucleotide Polymorphism , Research Design
ABSTRACT
The ability to integrate 'omics' (i.e. transcriptomics and proteomics) is becoming increasingly important to the understanding of regulatory mechanisms. There are currently no tools available to identify differentially expressed genes (DEGs) across different 'omics' data types or multi-dimensional data including time courses. We present fCI (f-divergence Cut-out Index), a model capable of simultaneously identifying DEGs from continuous and discrete transcriptomic, proteomic and integrated proteogenomic data. We show that fCI can be used across multiple diverse sets of data and can unambiguously find genes that show functional modulation, developmental changes or misregulation. Applying fCI to several proteogenomics datasets, we identified a number of important genes that showed distinctive regulation patterns. The package fCI is available at R Bioconductor and http://software.steenlab.org/fCI/.