1.
Gigascience ; 13, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38832466

ABSTRACT

BACKGROUND: Due to human error, sample swapping remains a common problem in large cohort studies with heterogeneous data types (e.g., a mix of Oxford Nanopore Technologies, Pacific Biosciences, and Illumina data). At present, all sample swapping detection methods require costly and unnecessary (e.g., if data are only used for genome assembly) alignment, positional sorting, and indexing of the data in order to compare sample similarity. As studies include more samples and new sequencing data types, robust quality control tools will become increasingly important. FINDINGS: The similarity between samples can be determined using indexed k-mer sequence variants. To increase statistical power, we use coverage information on variant sites, calculating similarity using a likelihood ratio-based test. The per-sample error rate and coverage bias (i.e., missing sites) can also be estimated from this information and used to determine whether a spatially indexed principal component analysis (PCA)-based prescreening method is applicable; such prescreening can greatly speed up analysis by avoiding exhaustive all-to-all comparisons. CONCLUSIONS: Because this tool processes raw data, is faster than alignment, and can be used on very low-coverage data, it can save an immense degree of computational resources in standard quality control (QC) pipelines. It is robust enough to be used on different sequencing data types, which is important in studies that leverage the strengths of different sequencing technologies. In addition to its primary use case of sample swap detection, this method also provides information useful in QC, such as error rate and coverage bias, as well as population-level PCA ancestry analysis visualization.
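The likelihood ratio idea in the FINDINGS can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the three-genotype binomial model, the fixed error rate `err`, the genotype prior, and every function name below are assumptions chosen only for illustration. Each sample is represented as a list of hypothetical `(alt_count, depth)` pairs at shared variant sites, and the score compares "both samples share one genotype" against "genotypes are independent".

```python
import math

def site_log_liks(alt, depth, err):
    """Log-likelihood of the observed counts under genotypes 0/0, 0/1, 1/1.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    alt_fractions = (err, 0.5, 1.0 - err)  # assumed expected alt-allele fractions
    return [alt * math.log(p) + (depth - alt) * math.log(1.0 - p)
            for p in alt_fractions]

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def same_sample_llr(sites_a, sites_b, err=0.01, prior=(0.25, 0.5, 0.25)):
    """Summed log-likelihood ratio: shared genotype vs. independent genotypes.

    Positive values favor "same individual"; negative values favor a mismatch.
    Sites with zero depth contribute exactly 0, so low coverage is tolerated.
    """
    log_prior = [math.log(p) for p in prior]
    llr = 0.0
    for (alt_a, dep_a), (alt_b, dep_b) in zip(sites_a, sites_b):
        la = site_log_liks(alt_a, dep_a, err)
        lb = site_log_liks(alt_b, dep_b, err)
        # H1: one genotype generated both observations.
        same = logsumexp([lp + x + y for lp, x, y in zip(log_prior, la, lb)])
        # H0: each observation has its own independently drawn genotype.
        diff = (logsumexp([lp + x for lp, x in zip(log_prior, la)])
                + logsumexp([lp + y for lp, y in zip(log_prior, lb)]))
        llr += same - diff
    return llr
```

Under these assumptions, two read sets from the same individual should accumulate a positive score as more sites are observed, while a swapped pair should drift negative.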


Subject(s)
High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Humans , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , Principal Component Analysis , Computational Biology/methods , Algorithms
2.
bioRxiv ; 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38496580

ABSTRACT

Pediatric high-grade glioma (pHGG) is an incurable central nervous system malignancy that is a leading cause of pediatric cancer death. While pHGG shares many similarities with adult glioma, it is increasingly recognized as a molecularly distinct, yet highly heterogeneous disease. In this study, we longitudinally profiled a molecularly diverse cohort of 16 pHGG patients before and after standard therapy through single-nucleus RNA and ATAC sequencing, whole-genome sequencing, and CODEX spatial proteomics to capture the evolution of the tumor microenvironment during progression following treatment. We found that the canonical neoplastic cell phenotypes of adult glioblastoma are insufficient to capture the range of tumor cell states in a pediatric cohort and observed differential tumor-myeloid interactions between malignant cell states. We identified key transcriptional regulators of pHGG cell states and did not observe the marked proneural to mesenchymal shift characteristic of adult glioblastoma. We showed that essential neuromodulators and the interferon response are upregulated post-therapy along with an increase in non-neoplastic oligodendrocytes. Through in vitro pharmacological perturbation, we demonstrated novel malignant cell-intrinsic targets. This multiomic atlas of longitudinal pHGG captures the key features of therapy response that support distinction from its adult counterpart and suggests therapeutic strategies that are targeted to pediatric gliomas.

3.
Genome Biol ; 25(1): 14, 2024 01 12.
Article in English | MEDLINE | ID: mdl-38217002

ABSTRACT

Existing methods for analysis of spatial transcriptomic data focus on delineating the global gene expression variations of cell types across the tissue, rather than local gene expression changes driven by cell-cell interactions. We propose a new statistical procedure called niche-differential expression (niche-DE) analysis that identifies cell-type-specific niche-associated genes, which are differentially expressed within a specific cell type in the context of specific spatial niches. We further develop niche-LR, a method to reveal ligand-receptor signaling mechanisms that underlie niche-differential gene expression patterns. Niche-DE and niche-LR are applicable to low-resolution spot-based spatial transcriptomics data and data that is single-cell or subcellular in resolution.
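As a rough illustration of what "differentially expressed within a specific cell type in the context of specific spatial niches" can mean, the Python sketch below regresses a gene's expression in one cell type on a kernel-weighted exposure to another cell type. It is a deliberately simplified stand-in, not the niche-DE model itself: the Gaussian kernel, the bandwidth `sigma`, ordinary least squares, and the function names `effective_niche` and `niche_slope` are all assumptions made for this example.

```python
import numpy as np

def effective_niche(coords, is_niche_type, sigma):
    """Gaussian-kernel-weighted amount of the niche cell type around each cell.

    coords: (n, 2) array of positions; is_niche_type: boolean array of length n.
    A cell does not count toward its own niche (diagonal weights are zeroed).
    """
    sq_dist = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    return weights @ is_niche_type.astype(float)

def niche_slope(expr, niche, is_index_type):
    """Least-squares slope of expression on niche exposure within the index cell type."""
    x = niche[is_index_type]
    y = expr[is_index_type]
    design = np.column_stack([np.ones_like(x), x])  # intercept + niche exposure
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]
```

A clearly nonzero slope would mark the gene as a candidate niche-associated gene for that (index cell type, niche cell type) pair; a real analysis would also need a noise model and multiple-testing control, which this sketch omits.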


Subject(s)
Gene Expression Profiling , Transcriptome , Cell Communication
4.
Nat Methods ; 21(2): 267-278, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38191930

ABSTRACT

It is poorly understood how different cells in a tissue organize themselves to support tissue functions. We describe the CytoCommunity algorithm for the identification of tissue cellular neighborhoods (TCNs) based on cell phenotypes and their spatial distributions. CytoCommunity learns a mapping directly from the cell phenotype space to the TCN space using a graph neural network model without intermediate clustering of cell embeddings. By leveraging graph pooling, CytoCommunity enables de novo identification of condition-specific and predictive TCNs under the supervision of sample labels. Using several types of spatial omics data, we demonstrate that CytoCommunity can identify TCNs of variable sizes with substantial improvement over existing methods. By analyzing risk-stratified colorectal and breast cancer data, CytoCommunity revealed new granulocyte-enriched and cancer-associated fibroblast-enriched TCNs specific to high-risk tumors and altered interactions between neoplastic and immune or stromal cells within and between TCNs. CytoCommunity can perform unsupervised and supervised analyses of spatial omics maps and enable the discovery of condition-specific cell-cell communication patterns across spatial scales.


Subject(s)
Algorithms , Neural Networks, Computer , Cluster Analysis , Phenotype
5.
bioRxiv ; 2023 Feb 06.
Article in English | MEDLINE | ID: mdl-36747708

ABSTRACT

Barrett's esophagus is a common type of metaplasia and a precursor of esophageal adenocarcinoma. However, the cell states and lineage connections underlying the origin, maintenance, and progression of Barrett's esophagus have not been resolved in humans. To address this, we performed single-cell lineage tracing and transcriptional profiling of patient cells isolated from metaplastic and healthy tissue. Our analysis revealed discrete lineages in Barrett's esophagus, normal esophagus, and gastric cardia. Transitional basal progenitor cells of the gastroesophageal junction were unexpectedly related to both esophagus and gastric cardia cells. Barrett's esophagus was polyclonal, with lineages that contained all progenitor and differentiated cell types. In contrast, precancerous dysplastic foci were initiated by the expansion of a single molecularly aberrant Barrett's esophagus clone. Together, these findings provide a comprehensive view of the cell dynamics of Barrett's esophagus, linking cell states along the full disease trajectory, from its origin to cancer.

6.
Bioinformatics ; 37(9): 1315-1316, 2021 06 09.
Article in English | MEDLINE | ID: mdl-32966548

ABSTRACT

SUMMARY: We present bedtk, a new toolkit for manipulating genomic intervals in the BED format. It supports sorting, merging, intersection, subtraction and the calculation of the breadth of coverage. Bedtk uses an implicit interval tree, a data structure for fast interval overlap queries. It is several to tens of times faster than existing tools and tends to use less memory. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/lh3/bedtk.
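To make the operations named in this summary concrete, here is a small Python sketch of interval merging, breadth of coverage, and an array-embedded ("implicit") interval tree for overlap queries. It is an illustrative approximation only: bedtk is a separate command-line tool, its exact array layout is not reproduced here, and the names `merge`, `breadth`, and `ImplicitIntervalTree` are hypothetical. Intervals are half-open `(start, end)` pairs, as in BED.

```python
def merge(intervals):
    """Merge overlapping or book-ended half-open intervals; returns a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def breadth(intervals):
    """Number of positions covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))

class ImplicitIntervalTree:
    """Interval tree stored in a sorted array, with no child pointers.

    The node for the index range [lo, hi) is its midpoint; max_end[mid]
    holds the largest end coordinate anywhere in that range.
    """

    def __init__(self, intervals):
        self.iv = sorted(intervals)
        self.max_end = [0] * len(self.iv)
        self._build(0, len(self.iv))

    def _build(self, lo, hi):
        if lo >= hi:
            return float("-inf")
        mid = (lo + hi) // 2
        best = max(self.iv[mid][1], self._build(lo, mid), self._build(mid + 1, hi))
        self.max_end[mid] = best
        return best

    def overlaps(self, start, end):
        """Return all stored intervals overlapping [start, end), in sorted order."""
        found = []
        self._query(0, len(self.iv), start, end, found)
        return found

    def _query(self, lo, hi, start, end, found):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        if self.max_end[mid] <= start:
            return  # nothing in this subtree ends after the query start
        self._query(lo, mid, start, end, found)
        s, e = self.iv[mid]
        if s < end:  # otherwise every interval to the right starts too late
            if e > start:
                found.append((s, e))
            self._query(mid + 1, hi, start, end, found)
```

Keeping the tree inside the sorted array avoids per-node allocations, which is one plausible reason a structure of this kind can be both fast and memory-efficient.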


Subject(s)
Software , Trees , Genome , Genomics