Results 1 - 9 of 9
1.
Hum Mol Genet ; 32(2): 218-230, 2023 01 06.
Article in English | MEDLINE | ID: mdl-35947991

ABSTRACT

DNA methylation plays a critical role in establishing and maintaining cell identity in the brain. Disruption of DNA methylation-related processes leads to diverse neurological disorders. However, the role of DNA methylation characteristics in neuronal diversity remains underexplored. Here, we report detailed context-specific DNA methylation maps for GABAergic, glutamatergic (Glu) and Purkinje neurons, together with matched transcriptome profiles. Genome-wide mCH levels are distinguishable, while the mCG levels are similar among the three cell types. Substantial CG-differentially methylated regions (DMRs) are also seen, with Glu neurons experiencing substantial hypomethylation events. The relationship between mCG levels and gene expression displays cell type-specific patterns, while genic CH methylation exhibits a negative effect on transcriptional abundance. We found that cell type-specific CG-DMRs are informative in terms of represented neuronal function. Furthermore, we observed that the identified Glu-specific hypo-DMRs have a high level of consistency with the chromatin accessibility of excitatory neurons and the regions enriched for histone modifications (H3K27ac and H3K4me1) of active enhancers, suggesting their regulatory potential. Hypomethylation regions specific to each cell type are predicted to bind neuron type-specific transcription factors. Finally, we show that the DNA methylation changes in a mouse model of Rett syndrome, a neurodevelopmental disorder caused by de novo mutations in MECP2, are cell type- and brain region-specific. Our results suggest that cell type-specific DNA methylation signatures are associated with the functional characteristics of the neuronal subtypes. The presented results emphasize the importance of DNA methylation-mediated epigenetic regulation in neuronal diversity and disease.


Subjects
Epigenesis, Genetic , Neurodevelopmental Disorders , Mice , Animals , Epigenome , DNA Methylation/genetics , Neurons/metabolism , Neurodevelopmental Disorders/genetics , Neurodevelopmental Disorders/metabolism
2.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33611426

ABSTRACT

Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as 'features'), whose expression patterns will then be used for downstream clustering. A good set of features should include the ones that distinguish different cell types, and the quality of such a set could have a significant impact on the clustering accuracy. All existing scRNA-seq clustering tools include a feature selection step relying on some simple unsupervised feature selection methods, mostly based on the statistical moments of gene-wise expression distributions. In this work, we carefully evaluate the impact of feature selection on cell clustering accuracy. In addition, we develop a feature selection algorithm named FEAture SelecTion (FEAST), which provides more representative features. We apply the method on 12 public scRNA-seq datasets and demonstrate that using features selected by FEAST with existing clustering tools significantly improves the clustering accuracy.


Subjects
Algorithms , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/methods , Benchmarking , Cluster Analysis , Datasets as Topic , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Humans
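
The abstract above describes the simple baselines used by existing tools as unsupervised selection "based on the statistical moments of gene-wise expression distributions". A minimal sketch of one such baseline follows, assuming a variance-to-mean (Fano factor) score. This is not the FEAST algorithm, and the function name `top_dispersion_genes` and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def top_dispersion_genes(counts, n_top):
    """Rank genes by variance/mean and return the n_top highest.

    counts: dict mapping gene name -> list of expression values across cells.
    (Hypothetical helper for illustration; not part of FEAST.)
    """
    scores = {}
    for gene, values in counts.items():
        mu = mean(values)
        # Fano factor; genes that are never expressed get a score of 0.
        scores[gene] = pvariance(values) / mu if mu > 0 else 0.0
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


# Toy data: geneA is on in half the cells, geneB is flat, geneC is silent.
toy_counts = {
    "geneA": [0, 0, 10, 12, 0, 11],
    "geneB": [5, 5, 6, 5, 5, 6],
    "geneC": [0, 0, 0, 0, 0, 0],
}
selected = top_dispersion_genes(toy_counts, 1)
```

A moment-based score like this favors genes with bimodal expression but ignores cluster structure, which is the limitation the abstract sets out to evaluate.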
3.
Bioinformatics ; 36(19): 4860-4868, 2020 12 08.
Article in English | MEDLINE | ID: mdl-32614380

ABSTRACT

MOTIVATION: Determining the sample size for adequate power to detect statistical significance is a crucial step at the design stage for high-throughput experiments. Even though a number of methods and tools are available for sample size calculation for microarray and RNA-seq in the context of differential expression (DE), this topic in the field of single-cell RNA sequencing is understudied. Moreover, the unique data characteristics present in scRNA-seq such as sparsity and heterogeneity increase the challenge. RESULTS: We propose POWSC, a simulation-based method, to provide power evaluation and sample size recommendation for single-cell RNA-sequencing DE analysis. POWSC consists of a data simulator that creates realistic expression data, and a power assessor that provides a comprehensive evaluation and visualization of the power and sample size relationship. The data simulator in POWSC outperforms two other state-of-the-art simulators in capturing key characteristics of real datasets. The power assessor in POWSC provides a variety of power evaluations including stratified and marginal power analyses for DEs characterized by two forms (phase transition or magnitude tuning), under different comparison scenarios. In addition, POWSC offers information for optimizing the tradeoffs between sample size and sequencing depth with the same total reads. AVAILABILITY AND IMPLEMENTATION: POWSC is an open-source R package available online at https://github.com/suke18/POWSC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Single-Cell Analysis , Software , Computer Simulation , Gene Expression Profiling , RNA-Seq , Sample Size , Sequence Analysis, RNA
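
The general idea of simulation-based power evaluation mentioned in the abstract above can be sketched as follows: simulate many datasets with a known effect, run the test on each, and report the fraction of rejections. This sketch assumes normally distributed values and a two-sample z-test with known standard deviation, which is far simpler than scRNA-seq data; it does not reproduce POWSC's simulator or API, and `estimate_power` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist, mean


def estimate_power(n_per_group, effect, sd=1.0, alpha=0.05, n_sim=2000, seed=0):
    """Monte Carlo power of a two-sided, two-sample z-test (known sd).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means for equal group sizes.
    se = sd * (2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        b = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        z = (mean(b) - mean(a)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim


# Power as a function of sample size for a fixed effect of 0.5 sd.
power_by_n = {n: estimate_power(n, effect=0.5) for n in (10, 40, 160)}
```

Reading `power_by_n` against a target power (for example 0.8) gives a rough sample size recommendation under these simplifying assumptions.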
4.
Nat Mach Intell ; 4(11): 940-952, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36873621

ABSTRACT

CITE-seq, a single-cell multi-omics technology that measures RNA and protein expression simultaneously in single cells, has been widely applied in biomedical research, especially in immune-related disorders and other diseases such as influenza and COVID-19. Despite the proliferation of CITE-seq, it is still costly to generate such data. Although data integration can increase information content, this raises computational challenges. First, combining multiple datasets is prone to batch effects that need to be addressed. Second, it is difficult to combine multiple CITE-seq datasets because the protein panels in different datasets may only partially overlap. Integrating multiple CITE-seq and single-cell RNA-seq (scRNA-seq) datasets is important because this allows the utilization of as much data as possible to uncover cell population heterogeneity. To overcome these challenges, we present sciPENN, a multi-use deep learning approach that supports CITE-seq and scRNA-seq data integration, protein expression prediction for scRNA-seq, protein expression imputation for CITE-seq, quantification of prediction and imputation uncertainty, and cell type label transfer from CITE-seq to scRNA-seq. Comprehensive evaluations spanning multiple datasets demonstrate that sciPENN outperforms other current state-of-the-art methods.

5.
Genome Biol ; 23(1): 270, 2022 12 27.
Article in English | MEDLINE | ID: mdl-36575445

ABSTRACT

A major question in systems biology is how to identify the core gene regulatory circuit that governs the decision-making of a biological process. Here, we develop a computational platform, named NetAct, for constructing core transcription factor regulatory networks using both transcriptomics data and literature-based transcription factor-target databases. NetAct robustly infers regulators' activity using target expression, constructs networks based on transcriptional activity, and integrates mathematical modeling for validation. Our in silico benchmark test shows that NetAct outperforms existing algorithms in inferring transcriptional activity and gene networks. We illustrate the application of NetAct to model networks driving TGF-β-induced epithelial-mesenchymal transition and macrophage polarization.


Subjects
Computational Biology , Transcription Factors , Transcription Factors/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Systems Biology , Algorithms
6.
Genome Biol ; 22(1): 264, 2021 09 09.
Article in English | MEDLINE | ID: mdl-34503564

ABSTRACT

BACKGROUND: Cell type identification is one of the most important questions in single-cell RNA sequencing (scRNA-seq) data analysis. With the accumulation of public scRNA-seq data, supervised cell type identification methods have gained increasing popularity due to better accuracy, robustness, and computational performance. Despite all the advantages, the performance of the supervised methods relies heavily on several key factors: feature selection, prediction method, and, most importantly, choice of the reference dataset. RESULTS: In this work, we perform extensive real data analyses to systematically evaluate these strategies in supervised cell identification. We first benchmark nine classifiers along with six feature selection strategies and investigate the impact of reference data size and number of cell types in cell type prediction. Next, we focus on how discrepancies between reference and target datasets and how data preprocessing such as imputation and batch effect correction affect prediction performance. We also investigate the strategies of pooling and purifying reference data. CONCLUSIONS: Based on our analysis results, we provide guidelines for using supervised cell typing methods. We suggest combining all individuals from available datasets to construct the reference dataset and using a multi-layer perceptron (MLP) as the classifier, along with the F-test as the feature selection method. All the code used for our analysis is available on GitHub (https://github.com/marvinquiet/RefConstruction_supervisedCelltyping).


Subjects
Algorithms , RNA-Seq , Single-Cell Analysis , Animals , Brain/metabolism , Databases, Genetic , Humans , Leukocytes, Mononuclear/metabolism , Mice , Molecular Sequence Annotation
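
The abstract above recommends the F-test for feature selection in supervised cell typing. A minimal sketch of that idea follows, assuming the standard one-way ANOVA F statistic computed per gene across labeled cell types in the reference data. The function names, toy data, and labels are hypothetical, and this is not the authors' code (which is linked in the abstract).

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of values)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        # Degenerate case: no within-group spread.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes_by_f(expr, labels):
    """Order genes from most to least cell type-discriminative.

    expr: dict gene -> list of values, one per cell; labels: cell type per cell.
    (Hypothetical helper for illustration.)
    """
    scores = {}
    for gene, values in expr.items():
        by_type = {}
        for value, label in zip(values, labels):
            by_type.setdefault(label, []).append(value)
        scores[gene] = anova_f(list(by_type.values()))
    return sorted(scores, key=scores.get, reverse=True)


toy_labels = ["T", "T", "T", "B", "B", "B"]
toy_expr = {
    "noise": [1, 5, 3, 2, 4, 3],    # same mean in both cell types
    "marker": [1, 2, 3, 4, 5, 6],   # shifted between cell types
}
ranked = rank_genes_by_f(toy_expr, toy_labels)
```

The top-ranked genes would then be passed to the classifier; because the score uses the reference labels, this is a supervised selection step, unlike moment-based filters.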
7.
Front Genet ; 12: 612670, 2021.
Article in English | MEDLINE | ID: mdl-33897755

ABSTRACT

Single cell RNA-seq data, like data from other sequencing technologies, contain systematic technical noise. Such noise results from a combined effect of unequal efficiencies in the capturing and counting of mRNA molecules, such as extraction/amplification efficiency and sequencing depth. We show that such technical effects are not only cell-specific, but also affect genes differently; thus, a simple cell-wise size factor adjustment may not be sufficient. We present a non-linear normalization approach that provides a cell- and gene-specific normalization factor for each gene in each cell. We show that the proposed normalization method (implemented in the "SC2P" package) reduces more technical variation than competing methods, without reducing biological variation. When technical effects such as sequencing depths are not balanced between cell populations, SC2P normalization also removes the bias due to uneven technical noise. This method is applicable to scRNA-seq experiments that do not use unique molecular identifiers (UMIs) and thus retain amplification biases.
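
For contrast with the cell- and gene-specific approach described above, the "simple cell-wise size factor adjustment" that the abstract argues may be insufficient can be sketched as follows. This assumes size factors defined as each cell's total count relative to the mean total; it is not the SC2P method, and `size_factor_normalize` is a hypothetical name.

```python
def size_factor_normalize(counts):
    """Divide every gene in a cell by that cell's single size factor.

    counts: list of cells, each a list of raw gene counts.
    (Hypothetical helper for illustration; not the SC2P normalization.)
    """
    totals = [sum(cell) for cell in counts]
    target = sum(totals) / len(totals)  # mean library size across cells
    normalized = []
    for cell, total in zip(counts, totals):
        # One factor per cell, applied identically to all of its genes.
        factor = total / target if total > 0 else 1.0
        normalized.append([c / factor for c in cell])
    return normalized


# Two cells with the same composition but a 10x difference in depth.
toy_cells = [[10, 20, 30], [1, 2, 3]]
toy_normalized = size_factor_normalize(toy_cells)
```

Because one factor rescales all genes in a cell equally, this kind of adjustment cannot correct technical effects that differ from gene to gene, which is the gap the abstract addresses.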

8.
Cell Rep Methods ; 1(4)2021 08 23.
Article in English | MEDLINE | ID: mdl-34671755

ABSTRACT

Identifying biomarkers to predict the clinical outcomes of individual patients is a fundamental problem in clinical oncology. Multiple single-gene biomarkers have already been identified and used in clinics. However, multiple oncogenes or tumor-suppressor genes are involved during the process of tumorigenesis. Additionally, the efficacy of single-gene biomarkers is limited by the extensively variable expression levels measured by high-throughput assays. In this study, we hypothesize that in individual tumor samples, the disruption of transcription homeostasis in key pathways or gene sets plays an important role in tumorigenesis and has profound implications for the patient's clinical outcome. We devised a computational method named iPath to identify, at the individual-sample level, which pathways or gene sets significantly deviate from their norms. We conducted a pan-cancer analysis and demonstrated that iPath is capable of identifying highly predictive biomarkers for clinical outcomes, including overall survival, tumor subtypes, and tumor-stage classifications.


Subjects
Biomarkers, Tumor , Neoplasms , Humans , Biomarkers, Tumor/genetics , Neoplasms/diagnosis , Prognosis , Carcinogenesis , Cell Transformation, Neoplastic , Gene Expression
9.
Cell Stem Cell ; 26(5): 766-781.e9, 2020 05 07.
Article in English | MEDLINE | ID: mdl-32142682

ABSTRACT

Human brain organoids provide unique platforms for modeling development and diseases by recapitulating the architecture of the embryonic brain. However, current organoid methods are limited by interior hypoxia and cell death due to insufficient surface diffusion, preventing generation of architecture resembling late developmental stages. Here, we report the sliced neocortical organoid (SNO) system, which bypasses the diffusion limit to prevent cell death over long-term cultures. This method leads to sustained neurogenesis and formation of an expanded cortical plate that establishes distinct upper and deep cortical layers for neurons and astrocytes, resembling the third trimester embryonic human neocortex. Using the SNO system, we further identify a critical role of WNT/β-catenin signaling in regulating human cortical neuron subtype fate specification, which is disrupted by a psychiatric-disorder-associated genetic mutation in patient induced pluripotent stem cell (iPSC)-derived SNOs. These results demonstrate the utility of SNOs for investigating previously inaccessible human-specific, late-stage cortical development and disease-relevant mechanisms.


Subjects
Induced Pluripotent Stem Cells , Neocortex , Humans , Neurogenesis , Neurons , Organoids