Results 1 - 9 of 9
1.
Hum Mol Genet ; 32(2): 218-230, 2023 01 06.
Article in English | MEDLINE | ID: mdl-35947991

ABSTRACT

DNA methylation plays a critical role in establishing and maintaining cell identity in the brain. Disruption of DNA methylation-related processes leads to diverse neurological disorders. However, the role of DNA methylation characteristics in neuronal diversity remains underexplored. Here, we report detailed context-specific DNA methylation maps for GABAergic, glutamatergic (Glu) and Purkinje neurons, together with matched transcriptome profiles. Genome-wide mCH levels are distinguishable, while the mCG levels are similar among the three cell types. Substantial CG-differentially methylated regions (DMRs) are also seen, with Glu neurons experiencing substantial hypomethylation events. The relationship between mCG levels and gene expression displays cell type-specific patterns, while genic CH methylation exhibits a negative effect on transcriptional abundance. We found that cell type-specific CG-DMRs are informative in terms of represented neuronal function. Furthermore, we observed that the identified Glu-specific hypo-DMRs have a high level of consistency with the chromatin accessibility of excitatory neurons and the regions enriched for histone modifications (H3K27ac and H3K4me1) of active enhancers, suggesting their regulatory potential. Hypomethylation regions specific to each cell type are predicted to bind neuron type-specific transcription factors. Finally, we show that the DNA methylation changes in a mouse model of Rett syndrome, a neurodevelopmental disorder caused by de novo mutations in MECP2, are cell type- and brain region-specific. Our results suggest that cell type-specific DNA methylation signatures are associated with the functional characteristics of the neuronal subtypes. The presented results emphasize the importance of DNA methylation-mediated epigenetic regulation in neuronal diversity and disease.


Subject(s)
Epigenesis, Genetic , Neurodevelopmental Disorders , Mice , Animals , Epigenome , DNA Methylation/genetics , Neurons/metabolism , Neurodevelopmental Disorders/genetics , Neurodevelopmental Disorders/metabolism
2.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33611426

ABSTRACT

Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as 'features'), whose expression patterns will then be used for downstream clustering. A good set of features should include the ones that distinguish different cell types, and the quality of such a set could have a significant impact on the clustering accuracy. All existing scRNA-seq clustering tools include a feature selection step relying on some simple unsupervised feature selection methods, mostly based on the statistical moments of gene-wise expression distributions. In this work, we carefully evaluate the impact of feature selection on cell clustering accuracy. In addition, we develop a feature selection algorithm named FEAture SelecTion (FEAST), which provides more representative features. We apply the method to 12 public scRNA-seq datasets and demonstrate that using features selected by FEAST with existing clustering tools significantly improves the clustering accuracy.


Subject(s)
Algorithms , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/methods , Benchmarking , Cluster Analysis , Datasets as Topic , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Humans
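
The abstract of record 2 notes that existing tools select features using statistical moments of gene-wise expression. As a rough illustration of that baseline only (it is not the FEAST algorithm, which is not described here), the sketch below ranks genes by dispersion (variance divided by mean). The function name `top_variable_genes` and the toy data are hypothetical.

```python
from statistics import mean, pvariance

def top_variable_genes(expr, k):
    """Return the k gene names with the highest dispersion (variance / mean).

    expr: dict mapping gene name -> list of expression values across cells.
    This is a generic moment-based baseline, not the FEAST method.
    """
    scores = {}
    for gene, values in expr.items():
        mu = mean(values)
        # Genes with zero mean carry no signal; give them the lowest score.
        scores[gene] = pvariance(values) / mu if mu > 0 else 0.0
    ranked = sorted(scores, key=lambda g: scores[g], reverse=True)
    return ranked[:k]

# Hypothetical toy expression values for three genes across four cells.
toy = {
    "geneA": [0, 0, 10, 12],  # varies strongly between cells
    "geneB": [5, 5, 5, 5],    # constant across cells
    "geneC": [1, 2, 1, 2],    # mild variation
}
selected = top_variable_genes(toy, 2)
```

Here the constant gene is dropped and the two varying genes are kept, which is the behaviour a moment-based filter is meant to provide.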
3.
Bioinformatics ; 36(19): 4860-4868, 2020 12 08.
Article in English | MEDLINE | ID: mdl-32614380

ABSTRACT

MOTIVATION: Determining the sample size for adequate power to detect statistical significance is a crucial step at the design stage for high-throughput experiments. Even though a number of methods and tools are available for sample size calculation for microarray and RNA-seq in the context of differential expression (DE), this topic in the field of single-cell RNA sequencing is understudied. Moreover, the unique data characteristics present in scRNA-seq such as sparsity and heterogeneity increase the challenge. RESULTS: We propose POWSC, a simulation-based method, to provide power evaluation and sample size recommendation for single-cell RNA-sequencing DE analysis. POWSC consists of a data simulator that creates realistic expression data, and a power assessor that provides a comprehensive evaluation and visualization of the power and sample size relationship. The data simulator in POWSC outperforms two other state-of-the-art simulators in capturing key characteristics of real datasets. The power assessor in POWSC provides a variety of power evaluations including stratified and marginal power analyses for DEs characterized by two forms (phase transition or magnitude tuning), under different comparison scenarios. In addition, POWSC offers information for optimizing the tradeoffs between sample size and sequencing depth with the same total reads. AVAILABILITY AND IMPLEMENTATION: POWSC is an open-source R package available online at https://github.com/suke18/POWSC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Single-Cell Analysis , Software , Computer Simulation , Gene Expression Profiling , RNA-Seq , Sample Size , Sequence Analysis, RNA
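
The general idea behind simulation-based power analysis, as mentioned in the abstract of record 3, can be sketched as follows: simulate data under a known effect many times, apply a test, and report the fraction of runs in which the effect is detected. This is a minimal sketch under strong assumptions (Gaussian data, a two-sample z-type test with a normal critical value). It does not reproduce the POWSC simulator or its API, and `estimate_power` is a hypothetical name.

```python
import random
from statistics import mean, variance

def estimate_power(n_per_group, effect, n_sim=500, z_crit=1.96, seed=0):
    """Estimate power as the fraction of simulated experiments that reject H0.

    Assumption: both groups are Gaussian with unit variance, which is far
    simpler than real scRNA-seq counts (sparse and heterogeneous).
    Requires n_per_group >= 2 so that the sample variance is defined.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference in means (equal group sizes).
        se = ((variance(a) + variance(b)) / n_per_group) ** 0.5
        z = (mean(b) - mean(a)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim

power_small = estimate_power(5, 1.0)   # few cells per group
power_large = estimate_power(50, 1.0)  # many cells per group
```

Comparing `power_small` with `power_large` shows the power and sample size relationship that such a tool is designed to chart: with the same effect size, more cells per group give a higher detection rate.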
4.
Nat Mach Intell ; 4(11): 940-952, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36873621

ABSTRACT

CITE-seq, a single-cell multi-omics technology that measures RNA and protein expression simultaneously in single cells, has been widely applied in biomedical research, especially in immune-related disorders and other diseases such as influenza and COVID-19. Despite the proliferation of CITE-seq, it is still costly to generate such data. Although data integration can increase information content, this raises computational challenges. First, combining multiple datasets is prone to batch effects that need to be addressed. Second, it is difficult to combine multiple CITE-seq datasets because the protein panels in different datasets may only partially overlap. Integrating multiple CITE-seq and single-cell RNA-seq (scRNA-seq) datasets is important because this allows the utilization of as much data as possible to uncover cell population heterogeneity. To overcome these challenges, we present sciPENN, a multi-use deep learning approach that supports CITE-seq and scRNA-seq data integration, protein expression prediction for scRNA-seq, protein expression imputation for CITE-seq, quantification of prediction and imputation uncertainty, and cell type label transfer from CITE-seq to scRNA-seq. Comprehensive evaluations spanning multiple datasets demonstrate that sciPENN outperforms other current state-of-the-art methods.

5.
Genome Biol ; 23(1): 270, 2022 12 27.
Article in English | MEDLINE | ID: mdl-36575445

ABSTRACT

A major question in systems biology is how to identify the core gene regulatory circuit that governs the decision-making of a biological process. Here, we develop a computational platform, named NetAct, for constructing core transcription factor regulatory networks using both transcriptomics data and literature-based transcription factor-target databases. NetAct robustly infers regulators' activity using target expression, constructs networks based on transcriptional activity, and integrates mathematical modeling for validation. Our in silico benchmark test shows that NetAct outperforms existing algorithms in inferring transcriptional activity and gene networks. We illustrate the application of NetAct to model networks driving TGF-β-induced epithelial-mesenchymal transition and macrophage polarization.


Subject(s)
Computational Biology , Transcription Factors , Transcription Factors/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Systems Biology , Algorithms
6.
Genome Biol ; 22(1): 264, 2021 09 09.
Article in English | MEDLINE | ID: mdl-34503564

ABSTRACT

BACKGROUND: Cell type identification is one of the most important questions in single-cell RNA sequencing (scRNA-seq) data analysis. With the accumulation of public scRNA-seq data, supervised cell type identification methods have gained increasing popularity due to better accuracy, robustness, and computational performance. Despite all the advantages, the performance of the supervised methods relies heavily on several key factors: feature selection, prediction method, and, most importantly, choice of the reference dataset. RESULTS: In this work, we perform extensive real data analyses to systematically evaluate these strategies in supervised cell identification. We first benchmark nine classifiers along with six feature selection strategies and investigate the impact of reference data size and number of cell types in cell type prediction. Next, we focus on how discrepancies between reference and target datasets and how data preprocessing such as imputation and batch effect correction affect prediction performance. We also investigate the strategies of pooling and purifying reference data. CONCLUSIONS: Based on our analysis results, we provide guidelines for using supervised cell typing methods. We suggest combining all individuals from available datasets to construct the reference dataset and using a multi-layer perceptron (MLP) as the classifier, along with the F-test as the feature selection method. All the code used for our analysis is available on GitHub (https://github.com/marvinquiet/RefConstruction_supervisedCelltyping).


Subject(s)
Algorithms , RNA-Seq , Single-Cell Analysis , Animals , Brain/metabolism , Databases, Genetic , Humans , Leukocytes, Mononuclear/metabolism , Mice , Molecular Sequence Annotation
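
The abstract of record 6 recommends the F-test for feature selection. A minimal sketch of what that can look like is given below: a one-way ANOVA F statistic is computed per gene across labelled cell types, and genes are ranked by it. This is an illustration under assumptions, not the authors' code (which is in the GitHub repository cited in the abstract). The names `f_statistic` and `rank_genes_by_f` and the toy data are hypothetical.

```python
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F statistic for one gene across labelled groups.

    Assumes at least two groups and more cells than groups.
    """
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of cells around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_genes_by_f(expr, labels, k):
    """Return the k genes with the largest F statistic."""
    scores = {gene: f_statistic(vals, labels) for gene, vals in expr.items()}
    return sorted(scores, key=lambda g: scores[g], reverse=True)[:k]

# Hypothetical reference data: six cells, two annotated cell types.
labels = ["T", "T", "T", "B", "B", "B"]
expr = {
    "marker": [9, 10, 11, 1, 2, 3],  # separates the two types
    "flat": [5, 6, 4, 5, 4, 6],      # same mean in both types
}
best = rank_genes_by_f(expr, labels, 1)
```

Unlike an unsupervised variance filter, this score uses the reference labels, so a gene is favoured only when its expression differs between cell types relative to its spread within them.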
7.
Front Genet ; 12: 612670, 2021.
Article in English | MEDLINE | ID: mdl-33897755

ABSTRACT

Single-cell RNA-seq data, like data from other sequencing technologies, contain systematic technical noise. Such noise results from a combined effect of unequal efficiencies in the capturing and counting of mRNA molecules, such as extraction/amplification efficiency and sequencing depth. We show that such technical effects are not only cell-specific but also affect genes differently; thus, a simple cell-wise size factor adjustment may not be sufficient. We present a non-linear normalization approach that provides a cell- and gene-specific normalization factor for each gene in each cell. We show that the proposed normalization method (implemented in the "SC2P" package) reduces more technical variation than competing methods, without reducing biological variation. When technical effects such as sequencing depths are not balanced between cell populations, SC2P normalization also removes the bias due to uneven technical noise. This method is applicable to scRNA-seq experiments that do not use unique molecular identifiers (UMIs) and thus retain amplification biases.
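
For context, the "simple cell-wise size factor adjustment" that the abstract of record 7 argues may be insufficient can be sketched as below. This illustrates the baseline only and is not the SC2P method. The function name `size_factor_normalize` and the toy counts are hypothetical, and library-size scaling is assumed as the size factor definition.

```python
def size_factor_normalize(counts):
    """Scale each cell by one factor: its total count over the mean total.

    counts: list of cells, each a list of per-gene counts.
    Returns (factors, normalized_counts).
    """
    totals = [sum(cell) for cell in counts]
    mean_total = sum(totals) / len(totals)
    factors = [t / mean_total for t in totals]
    # Every gene in a cell is divided by the same factor; cells with no
    # counts at all are left as zeros to avoid division by zero.
    normalized = [
        [c / f if f > 0 else 0.0 for c in cell]
        for cell, f in zip(counts, factors)
    ]
    return factors, normalized

# Hypothetical toy data: the second cell was sequenced twice as deeply.
cells = [[10, 30, 60], [20, 60, 120]]
factors, normalized = size_factor_normalize(cells)
```

Because a single factor is applied to all genes in a cell, this adjustment cannot correct technical effects that differ from gene to gene, which is the limitation the abstract points to.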

8.
Cell Rep Methods ; 1(4)2021 08 23.
Article in English | MEDLINE | ID: mdl-34671755

ABSTRACT

Identifying biomarkers to predict the clinical outcomes of individual patients is a fundamental problem in clinical oncology. Multiple single-gene biomarkers have already been identified and used in clinics. However, multiple oncogenes or tumor-suppressor genes are involved during the process of tumorigenesis. Additionally, the efficacy of single-gene biomarkers is limited by the extensively variable expression levels measured by high-throughput assays. In this study, we hypothesize that in individual tumor samples, the disruption of transcription homeostasis in key pathways or gene sets plays an important role in tumorigenesis and has profound implications for the patient's clinical outcome. We devised a computational method named iPath to identify, at the individual-sample level, which pathways or gene sets significantly deviate from their norms. We conducted a pan-cancer analysis and demonstrated that iPath is capable of identifying highly predictive biomarkers for clinical outcomes, including overall survival, tumor subtypes, and tumor-stage classifications.


Subject(s)
Biomarkers, Tumor , Neoplasms , Humans , Biomarkers, Tumor/genetics , Neoplasms/diagnosis , Prognosis , Carcinogenesis , Cell Transformation, Neoplastic , Gene Expression
9.
Cell Stem Cell ; 26(5): 766-781.e9, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32142682

ABSTRACT

Human brain organoids provide unique platforms for modeling development and diseases by recapitulating the architecture of the embryonic brain. However, current organoid methods are limited by interior hypoxia and cell death due to insufficient surface diffusion, preventing generation of architecture resembling late developmental stages. Here, we report the sliced neocortical organoid (SNO) system, which bypasses the diffusion limit to prevent cell death over long-term cultures. This method leads to sustained neurogenesis and formation of an expanded cortical plate that establishes distinct upper and deep cortical layers for neurons and astrocytes, resembling the third trimester embryonic human neocortex. Using the SNO system, we further identify a critical role of WNT/β-catenin signaling in regulating human cortical neuron subtype fate specification, which is disrupted by a psychiatric-disorder-associated genetic mutation in patient induced pluripotent stem cell (iPSC)-derived SNOs. These results demonstrate the utility of SNOs for investigating previously inaccessible human-specific, late-stage cortical development and disease-relevant mechanisms.


Subject(s)
Induced Pluripotent Stem Cells , Neocortex , Humans , Neurogenesis , Neurons , Organoids