1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38975891

ABSTRACT

Unsupervised feature selection is a critical step for efficient and accurate analysis of single-cell RNA-seq data. Previous benchmarks used two different criteria to compare feature selection methods: (i) the proportion of ground-truth marker genes included in the selected features and (ii) the accuracy of cell clustering against ground-truth cell types. Here, we systematically compare the performance of 11 feature selection methods on both criteria. We first demonstrate the discordance between these criteria and suggest using the latter. We then compare, across feature selection methods, the distribution of mean expression among the selected genes. We show that lowly expressed genes exhibit very high coefficients of variation and are mostly excluded by high-performance methods. In particular, high-deviation- and high-expression-based methods outperform the widely used approach in the Seurat package in clustering cells and data visualization. We further show that they also enable a clear separation of the same cell type from different tissues as well as accurate estimation of cell trajectories.
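The observation about lowly expressed genes and their coefficients of variation can be illustrated with a minimal sketch. It is not one of the 11 benchmarked methods: the function names, the toy counts and the simple "top k by mean" rule are assumptions made here for illustration only.

```python
from statistics import mean, pstdev

def gene_stats(counts):
    """Return {gene: (mean, coefficient of variation)} for per-cell counts.

    `counts` maps a gene name to a list of counts, one per cell.
    This layout is a hypothetical toy format, not a real scRNA-seq object.
    """
    stats = {}
    for gene, values in counts.items():
        m = mean(values)
        # CV = standard deviation / mean; undefined for an all-zero gene
        cv = pstdev(values) / m if m > 0 else float("inf")
        stats[gene] = (m, cv)
    return stats

def select_high_expression(counts, k):
    """Pick the k genes with the highest mean count (illustrative rule)."""
    stats = gene_stats(counts)
    return sorted(stats, key=lambda g: stats[g][0], reverse=True)[:k]

# Toy data (invented): geneA is sparse and lowly expressed
toy_counts = {
    "geneA": [0, 0, 0, 1, 0, 0],
    "geneB": [10, 12, 30, 28, 11, 29],
    "geneC": [5, 5, 6, 5, 4, 5],
}

stats = gene_stats(toy_counts)
selected = select_high_expression(toy_counts, 2)
```

In this toy example the sparse gene has the largest coefficient of variation even though it carries little signal, and a mean-based rule leaves it out.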


Subjects
Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Humans , Gene Expression Profiling/methods , Algorithms , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods
2.
Nucleic Acids Res ; 47(9): e53, 2019 05 21.
Article in English | MEDLINE | ID: mdl-30820547

ABSTRACT

We present a novel approach to identify human microRNA (miRNA) regulatory modules (mRNA targets and relevant cell conditions) by biclustering a large collection of mRNA fold-change data for sequence-specific targets. Bicluster targets were assessed using validated messenger RNA (mRNA) targets and exhibited, on average, a 17.0% (median 19.4%) improvement in gain in certainty (sensitivity + specificity). The net gain increased further, up to 32.0% (median 33.4%), when functional networks of targets were incorporated. We analyzed cancer-specific biclusters and found that the PI3K/Akt signaling pathway is strongly enriched with targets of a few miRNAs in breast cancer and diffuse large B-cell lymphoma. Indeed, five independent prognostic miRNAs were identified, and repression of bicluster targets and pathway activity by miR-29 was experimentally validated. In total, 29 898 biclusters for 459 human miRNAs were collected in the BiMIR database, where biclusters are searchable by miRNA, tissue, disease, keyword and target gene.
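The abstract defines gain in certainty as sensitivity + specificity. A minimal sketch of that quantity for a set of predicted targets follows. The function name, the set-based inputs and the toy gene labels are assumptions for illustration; the abstract does not say how the percentage improvement was derived, so that step is not attempted here.

```python
def gain_in_certainty(predicted, validated, universe):
    """Return sensitivity + specificity for a predicted target set.

    predicted: genes called targets (hypothetical input)
    validated: genes with validated target status
    universe:  all genes considered; non-validated genes count as negatives
    """
    negatives = universe - validated
    tp = len(predicted & validated)   # validated targets that were predicted
    fn = len(validated - predicted)   # validated targets that were missed
    tn = len(negatives - predicted)   # non-targets correctly left out
    fp = len(negatives & predicted)   # non-targets wrongly predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity

# Toy example with invented gene labels
universe = {"g1", "g2", "g3", "g4", "g5", "g6"}
validated = {"g1", "g2", "g3"}
predicted = {"g1", "g2", "g4"}
score = gain_in_certainty(predicted, validated, universe)
```

A value of 1.0 corresponds to a predictor that is no better than chance, and 2.0 to a perfect one, which is why the sum is a convenient single figure for comparing target sets.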


Subjects
Big Data , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , MicroRNAs/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , Lymphoma, Large B-Cell, Diffuse/genetics , Lymphoma, Large B-Cell, Diffuse/pathology , Phosphatidylinositol 3-Kinases/genetics , Prognosis , Proto-Oncogene Proteins c-akt/genetics , Signal Transduction/genetics , Transcriptome/genetics
3.
Nucleic Acids Res ; 46(10): e60, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29562348

ABSTRACT

Pathway-based analysis in genome-wide association studies (GWAS) is widely used to uncover novel multi-genic functional associations. Many pathway-based methods test the enrichment of associated genes in pathways, but they have exhibited low power and were highly affected by free parameters. We present a novel method and software, GSA-SNP2, for pathway enrichment analysis of GWAS P-value data. GSA-SNP2 provides high power, decent type I error control and fast computation by incorporating the random set model and a SNP-count adjusted gene score. In a comparative study using simulated and real GWAS data, GSA-SNP2 exhibited high power and best prioritized gold standard positive pathways compared with six existing enrichment-based methods and two self-contained methods (an alternative pathway analysis approach). Based on these results, the difference between pathway analysis approaches was investigated and the effects of gene correlation structures on pathway enrichment analysis were also discussed. In addition, GSA-SNP2 can visualize protein interaction networks within and across the significant pathways so that the user can prioritize the core subnetworks for further studies. GSA-SNP2 is freely available at https://sourceforge.net/projects/gsasnp2.
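The random set idea mentioned above can be sketched generically: a pathway's mean gene score is compared with the mean expected for a random gene set of the same size drawn without replacement. The sketch below uses the standard finite-population variance for such a draw. It is an assumption-laden illustration, not the GSA-SNP2 implementation: the function name, the toy scores and the omission of the SNP-count adjustment are all choices made here.

```python
from math import sqrt
from statistics import mean, pvariance

def random_set_z(gene_scores, pathway_genes):
    """Z-score of a pathway's mean gene score under a random set null.

    gene_scores:   {gene: score} for all N scored genes (hypothetical input)
    pathway_genes: iterable of gene names in the pathway
    """
    scores = list(gene_scores.values())
    n_total = len(scores)
    members = [gene_scores[g] for g in pathway_genes if g in gene_scores]
    m = len(members)
    if m == 0 or m >= n_total:
        raise ValueError("pathway must contain between 1 and N-1 scored genes")
    mu = mean(scores)
    var = pvariance(scores)  # population variance over all genes
    # Variance of the mean of m scores drawn without replacement from N
    se = sqrt(var / m * (n_total - m) / (n_total - 1))
    if se == 0:
        raise ValueError("gene scores have zero variance")
    return (mean(members) - mu) / se

# Toy gene scores (invented), e.g. -log10 of a gene-level P-value
toy_scores = {"g1": 5.0, "g2": 4.0, "g3": 1.0, "g4": 1.0, "g5": 0.5, "g6": 0.5}
z_high = random_set_z(toy_scores, ["g1", "g2"])
z_low = random_set_z(toy_scores, ["g5", "g6"])
```

A large positive z-score suggests the pathway is enriched for high-scoring genes relative to a random set of equal size.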


Subjects
Genome-Wide Association Study/methods , Software , Asian People/genetics , Body Height/genetics , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Humans , Polymorphism, Single Nucleotide , Programming Languages , Protein Interaction Maps
4.
Nat Commun ; 14(1): 1570, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36944632

ABSTRACT

Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low-depth data, single-cell techniques based on a zero-inflation model degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and a fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
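As a rough illustration of the Wilcoxon test named above, the following sketch computes a rank-sum (Mann-Whitney U) statistic for one gene between two groups of cells, with a two-sided p-value from the normal approximation. It is a generic textbook version with no tie or continuity correction, not one of the 46 benchmarked workflows; the function name and the toy expression values are assumptions.

```python
from math import erfc, sqrt

def rank_sum_test(x, y):
    """Return (U, two-sided p) comparing samples x and y.

    U counts pairs where an x value exceeds a y value (ties count 0.5).
    The p-value uses a normal approximation without tie or continuity
    correction, so it is only a rough guide for small or heavily tied data.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p

# Toy expression values (invented) for one gene in two conditions
condition_a = [5, 6, 7]
condition_b = [1, 2, 3]
u_stat, p_value = rank_sum_test(condition_a, condition_b)
```

In practice the test would be run per gene and followed by multiple-testing correction; handling batches (for example by testing within each batch) is left out of this sketch.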


Subjects
Benchmarking , Data Analysis , Sequence Analysis, RNA/methods , Benchmarking/methods , Computer Simulation , Workflow , Single-Cell Analysis/methods , Gene Expression Profiling/methods