Results 1 - 4 of 4
1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38975891

ABSTRACT

Unsupervised feature selection is a critical step for efficient and accurate analysis of single-cell RNA-seq data. Previous benchmarks used two different criteria to compare feature selection methods: (i) the proportion of ground-truth marker genes included in the selected features and (ii) the accuracy of cell clustering using ground-truth cell types. Here, we systematically compare the performance of 11 feature selection methods on both criteria. We first demonstrate the discordance between these criteria and suggest using the latter. We then compare the distributions of mean expression of the genes selected by each method. We show that lowly expressed genes exhibit very high coefficients of variation and are mostly excluded by high-performance methods. In particular, high-deviation- and high-expression-based methods outperform the widely used methods in the Seurat package in clustering cells and data visualization. We further show that they also enable a clear separation of the same cell type from different tissues as well as accurate estimation of cell trajectories.
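The point about lowly expressed genes and the coefficient of variation can be illustrated with a small sketch. This is not the procedure benchmarked in the paper: the toy counts are invented, and the helper names `coefficient_of_variation` and `select_high_expression` are hypothetical. Ranking genes by mean count is only an assumed stand-in for a "high-expression-based" method.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Return the population standard deviation divided by the mean.

    A gene with a mean of zero gets an infinite value.
    """
    m = mean(counts)
    return float("inf") if m == 0 else pstdev(counts) / m


def select_high_expression(expr, n_top):
    """Return the n_top gene names with the largest mean count (assumed rule)."""
    ranked = sorted(expr, key=lambda gene: mean(expr[gene]), reverse=True)
    return ranked[:n_top]


# Toy counts per gene across six cells (invented numbers, not from the paper).
toy = {
    "geneA": [0, 0, 0, 0, 0, 6],         # lowly expressed, a single spike
    "geneB": [50, 55, 45, 60, 40, 50],   # highly expressed, stable
    "geneC": [10, 30, 5, 40, 8, 27],     # moderately expressed, variable
}

cv = {gene: coefficient_of_variation(counts) for gene, counts in toy.items()}
top2 = select_high_expression(toy, 2)
```

In this toy data the lowly expressed gene has the largest coefficient of variation even though a single cell carries all of its signal, so a selection rule that ranks by that quantity alone would favor it, while the mean-based rule leaves it out.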


Subject(s)
Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Humans , Gene Expression Profiling/methods , Algorithms , Computational Biology/methods , Sequence Analysis, RNA/methods , RNA-Seq/methods
2.
Nucleic Acids Res ; 47(9): e53, 2019 05 21.
Article in English | MEDLINE | ID: mdl-30820547

ABSTRACT

We present a novel approach to identify human microRNA (miRNA) regulatory modules (mRNA targets and relevant cell conditions) by biclustering a large collection of mRNA fold-change data for sequence-specific targets. Bicluster targets were assessed using validated messenger RNA (mRNA) targets and exhibited, on average, a 17.0% (median 19.4%) improved gain in certainty (sensitivity + specificity). The net gain was further increased up to 32.0% (median 33.4%) by incorporating functional networks of targets. We analyzed cancer-specific biclusters and found that the PI3K/Akt signaling pathway is strongly enriched with targets of a few miRNAs in breast cancer and diffuse large B-cell lymphoma. Indeed, five independent prognostic miRNAs were identified, and repression of bicluster targets and pathway activity by miR-29 was experimentally validated. In total, 29 898 biclusters for 459 human miRNAs were collected in the BiMIR database, where biclusters are searchable by miRNA, tissue, disease, keyword and target gene.
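The abstract scores target sets by sensitivity + specificity. A minimal sketch of that sum for one predicted target set follows. The function name `sensitivity_plus_specificity`, the gene labels and the background "universe" are all hypothetical, and the abstract does not say how the reported percentage gains were derived from this quantity, so the sketch stops at the sum itself.

```python
def sensitivity_plus_specificity(predicted, validated, universe):
    """Return sensitivity + specificity of a predicted target set.

    predicted: genes called targets; validated: genes known to be targets;
    universe: every gene considered (assumed background, not from the paper).
    """
    predicted, validated, universe = set(predicted), set(validated), set(universe)
    negatives = universe - validated
    if not validated or not negatives:
        raise ValueError("need at least one validated target and one non-target")
    true_positives = len(predicted & validated)
    true_negatives = len(negatives - predicted)
    sensitivity = true_positives / len(validated)
    specificity = true_negatives / len(negatives)
    return sensitivity + specificity


# Invented example: ten genes, four of them validated targets.
universe = [f"g{i}" for i in range(1, 11)]
validated = {"g1", "g2", "g3", "g4"}
broad_prediction = {"g1", "g2", "g5", "g6", "g7"}   # 2 hits, 3 false calls
refined_prediction = {"g1", "g2", "g3", "g5"}       # 3 hits, 1 false call

score_broad = sensitivity_plus_specificity(broad_prediction, validated, universe)
score_refined = sensitivity_plus_specificity(refined_prediction, validated, universe)
```

A set chosen at random is expected to score about 1.0 on this scale, which is why the broad prediction above, with half of the targets found and half of the non-targets rejected, lands exactly there.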


Subject(s)
Big Data , Gene Expression Profiling/methods , Gene Regulatory Networks/genetics , MicroRNAs/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , Lymphoma, Large B-Cell, Diffuse/genetics , Lymphoma, Large B-Cell, Diffuse/pathology , Phosphatidylinositol 3-Kinases/genetics , Prognosis , Proto-Oncogene Proteins c-akt/genetics , Signal Transduction/genetics , Transcriptome/genetics
3.
Nucleic Acids Res ; 46(10): e60, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29562348

ABSTRACT

Pathway-based analysis in genome-wide association studies (GWAS) is widely used to uncover novel multi-genic functional associations. Many pathway-based methods test the enrichment of associated genes in pathways, but they have exhibited low power and were highly affected by free parameters. We present a novel method and software, GSA-SNP2, for pathway enrichment analysis of GWAS P-value data. GSA-SNP2 provides high power, decent type I error control and fast computation by incorporating the random set model and a SNP-count adjusted gene score. In a comparative study using simulated and real GWAS data, GSA-SNP2 exhibited high power and prioritized gold-standard positive pathways best compared with six existing enrichment-based methods and two self-contained methods (an alternative pathway analysis approach). Based on these results, the difference between pathway analysis approaches was investigated and the effects of gene correlation structures on pathway enrichment analysis were also discussed. In addition, GSA-SNP2 can visualize protein interaction networks within and across the significant pathways so that the user can prioritize the core subnetworks for further study. GSA-SNP2 is freely available at https://sourceforge.net/projects/gsasnp2.
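The random set model mentioned in the abstract can be sketched as a z-score that compares the mean gene score of a pathway with the mean over all genes, using the variance of a mean drawn without replacement. This is a generic illustration under assumptions, not the GSA-SNP2 implementation: the function name `random_set_z` is hypothetical, the scores are invented, and the SNP-count adjustment of the gene scores is not reproduced.

```python
from math import sqrt
from statistics import mean, pvariance


def random_set_z(all_scores, set_scores):
    """Z-score of a gene set's mean score under a random set model.

    all_scores: scores of all N genes; set_scores: scores of the m genes in
    the pathway (a subset of all_scores). Requires 0 < m < N.
    """
    n_genes, m = len(all_scores), len(set_scores)
    if not 0 < m < n_genes:
        raise ValueError("the gene set must be a non-empty proper subset")
    mu = mean(all_scores)
    sigma2 = pvariance(all_scores)
    # Variance of the mean of m scores sampled without replacement from N.
    var_mean = (sigma2 / m) * ((n_genes - m) / (n_genes - 1))
    return (mean(set_scores) - mu) / sqrt(var_mean)


# Invented gene scores; higher means stronger association.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
z_high = random_set_z(scores, [8, 9, 10])   # pathway made of top-scoring genes
z_low = random_set_z(scores, [1, 2, 3])     # pathway made of bottom-scoring genes
```

The finite-population factor (N - m) / (N - 1) shrinks the variance for large pathways, so a pathway covering most of the genome cannot look strongly enriched by chance.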


Subject(s)
Genome-Wide Association Study/methods , Software , Asian People/genetics , Body Height/genetics , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Humans , Polymorphism, Single Nucleotide , Programming Languages , Protein Interaction Maps
4.
Nat Commun ; 14(1): 1570, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36944632

ABSTRACT

Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and a fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
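Of the methods named in the abstract, the Wilcoxon test on uncorrected data is the simplest to sketch. The code below computes the rank-sum (Mann-Whitney) U statistic for one gene between two groups of cells, with a two-sided p-value from the normal approximation. It is an illustration only: the function name `wilcoxon_rank_sum` and the counts are invented, no tie or continuity correction is applied, and the benchmarked workflows would have relied on established statistical packages rather than code like this.

```python
from math import erfc, sqrt


def wilcoxon_rank_sum(group_a, group_b):
    """Return (U, p) for a two-sided rank-sum test of group_a vs group_b.

    U counts pairs where a value in group_a exceeds one in group_b, with
    ties counted as one half. The p-value uses the normal approximation
    without tie correction (a simplifying assumption).
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups need at least one observation")
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, p_value


# Invented counts of one gene in two groups of cells.
u_shift, p_shift = wilcoxon_rank_sum([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])
u_same, p_same = wilcoxon_rank_sum([1, 2, 3], [1, 2, 3])
```

Because the test uses only the ordering of the values, it makes no assumption about zero inflation or the count distribution.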


Subject(s)
Benchmarking , Data Analysis , Sequence Analysis, RNA/methods , Benchmarking/methods , Computer Simulation , Workflow , Single-Cell Analysis/methods , Gene Expression Profiling/methods