Search | VHL Regional Portal

NestedBD: Bayesian inference of phylogenetic trees from single-cell copy number profiles under a birth-death model.

Liu, Yushu; Edrisi, Mohammadamin; Yan, Zhi; Ogilvie, Huw A; Nakhleh, Luay.

Algorithms Mol Biol ; 19(1): 18, 2024 Apr 29.

Article in English | MEDLINE | ID: mdl-38685065

ABSTRACT

Copy number aberrations (CNAs) are ubiquitous in many types of cancer. Inferring CNAs from cancer genomic data could help shed light on the initiation, progression, and potential treatment of cancer. While such data have traditionally been available via "bulk sequencing," the more recently introduced techniques for single-cell DNA sequencing (scDNAseq) provide the type of data that makes CNA inference possible at single-cell resolution. We introduce a new birth-death evolutionary model of CNAs and a Bayesian method, NestedBD, for the inference of evolutionary trees (topologies and branch lengths with relative mutation rates) from single-cell data. We evaluated NestedBD's performance using simulated data sets, benchmarking its accuracy against traditional phylogenetic tools as well as state-of-the-art methods. The results show that NestedBD infers more accurate topologies and branch lengths, and that the birth-death model can improve the accuracy of copy number estimation. When applied to biological data sets, NestedBD infers plausible evolutionary histories of two colorectal cancer samples. NestedBD is available at https://github.com/Androstane/NestedBD .
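To make the phrase "birth-death evolutionary model of CNAs" concrete, the sketch below simulates a generic linear birth-death process for the copy number of one genomic segment along one branch. It is an illustration under stated assumptions, not the NestedBD model or its code: the function name `simulate_copy_number`, the per-copy rates, and the choice to treat zero copies as an absorbing state are all hypothetical and are not taken from the abstract.

```python
import random


def simulate_copy_number(start, birth_rate, death_rate, duration, rng):
    """Gillespie-style simulation of a linear birth-death process (assumed model).

    Each existing copy is duplicated at `birth_rate` and lost at `death_rate`
    per unit time. Zero copies is treated as absorbing, because a segment that
    has been lost entirely cannot be duplicated again.
    """
    copies, elapsed = start, 0.0
    while copies > 0:
        total_rate = copies * (birth_rate + death_rate)
        if total_rate <= 0:
            break  # no events can happen when both rates are zero
        # Waiting time to the next event is exponential in the total rate.
        elapsed += rng.expovariate(total_rate)
        if elapsed >= duration:
            break  # the branch ends before the next event
        # Choose a gain or a loss in proportion to the two rates.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            copies += 1
        else:
            copies -= 1
    return copies


# Ten independent segments, each starting from the diploid state of 2 copies.
rng = random.Random(0)
profile = [simulate_copy_number(2, 0.3, 0.3, 1.0, rng) for _ in range(10)]
print(profile)
```

Longer branches or higher rates give profiles that drift further from the starting state, which is the kind of signal a tree-inference method can use to estimate branch lengths.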

Accurate integration of single-cell DNA and RNA for analyzing intratumor heterogeneity using MaCroDNA.

Edrisi, Mohammadamin; Huang, Xiru; Ogilvie, Huw A; Nakhleh, Luay.

Nat Commun ; 14(1): 8262, 2023 Dec 13.

Article in English | MEDLINE | ID: mdl-38092737

ABSTRACT

Cancers develop and progress as mutations accumulate, and with the advent of single-cell DNA and RNA sequencing, researchers can observe these mutations and their transcriptomic effects and predict proteomic changes with remarkable temporal and spatial precision. However, to connect genomic mutations with their transcriptomic and proteomic consequences, cells with either only DNA data or only RNA data must be mapped to a common domain. For this purpose, we present MaCroDNA, a method that uses maximum weighted bipartite matching of per-gene read counts from single-cell DNA and RNA-seq data. Using ground truth information from colorectal cancer data, we demonstrate the advantage of MaCroDNA over existing methods in accuracy and speed. Exemplifying the utility of single-cell data integration in cancer research, we suggest, based on results derived using MaCroDNA, that genomic mutations of large effect size increasingly contribute to differential expression between cells as Barrett's esophagus progresses to esophageal cancer, reaffirming the findings of previous studies.

Subject(s)

Adenocarcinoma; Barrett Esophagus; Esophageal Neoplasms; Humans; Adenocarcinoma/genetics; RNA/genetics; Proteomics; Barrett Esophagus/genetics; Esophageal Neoplasms/pathology; DNA
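The MaCroDNA abstract above describes pairing cells by maximum weighted bipartite matching of per-gene read counts. A minimal sketch of that idea follows, with several assumptions that the abstract does not confirm: Pearson correlation is used as the edge weight, the function names `pearson` and `match_cells` are hypothetical, the counts are toy values, and the optimum is found by exhaustive search, which is only practical for a handful of cells. A real implementation would use a polynomial-time assignment solver.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def match_cells(dna_counts, rna_counts):
    """Pair each RNA cell with a distinct DNA cell, maximizing total weight.

    Returns (assignment, score), where assignment[i] is the index of the DNA
    cell paired with RNA cell i. Assumes len(rna_counts) <= len(dna_counts).
    """
    weights = [[pearson(r, d) for d in dna_counts] for r in rna_counts]
    best_score, best = float("-inf"), None
    # Exhaustive search over one-to-one assignments (toy-sized inputs only).
    for perm in permutations(range(len(dna_counts)), len(rna_counts)):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best = score, perm
    return list(best), best_score


# Toy per-gene read counts: rows are cells, columns are four genes.
dna_counts = [[10, 20, 30, 40], [40, 30, 20, 10], [10, 40, 10, 40]]
rna_counts = [[38, 33, 19, 12], [12, 37, 9, 44], [9, 22, 28, 45]]
assignment, score = match_cells(dna_counts, rna_counts)
print(assignment, round(score, 3))
```

Matching all cells jointly, rather than giving each RNA cell its individually best DNA cell, prevents several RNA cells from being assigned to the same DNA cell.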

Phylovar: toward scalable phylogeny-aware inference of single-nucleotide variations from single-cell DNA sequencing data.

Edrisi, Mohammadamin; Valecha, Monica V; Chowdary, Sunkara B V; Robledo, Sergio; Ogilvie, Huw A; Posada, David; Zafar, Hamim; Nakhleh, Luay.

Bioinformatics ; 38(Suppl 1): i195-i202, 2022 Jun 24.

Article in English | MEDLINE | ID: mdl-35758771

ABSTRACT

MOTIVATION: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. RESULTS: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, and an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer dataset, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. AVAILABILITY AND IMPLEMENTATION: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.

Subject(s)

High-Throughput Nucleotide Sequencing; Nucleotides; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Humans; Phylogeny; Sequence Analysis, DNA

Current progress and open challenges for applying deep learning across the biosciences.

Sapoval, Nicolae; Aghazadeh, Amirali; Nute, Michael G; Antunes, Dinler A; Balaji, Advait; Baraniuk, Richard; Barberan, C J; Dannenfelser, Ruth; Dun, Chen; Edrisi, Mohammadamin; Elworth, R A Leo; Kille, Bryce; Kyrillidis, Anastasios; Nakhleh, Luay; Wolfe, Cameron R; Yan, Zhi; Yao, Vicky; Treangen, Todd J.

Nat Commun ; 13(1): 1728, 2022 Apr 01.

Article in English | MEDLINE | ID: mdl-35365602

ABSTRACT

Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL on five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.

Subject(s)

Deep Learning; Computational Biology; Phylogeny; Proteins; Systems Biology

Methods for copy number aberration detection from single-cell DNA-sequencing data.

Mallory, Xian F; Edrisi, Mohammadamin; Navin, Nicholas; Nakhleh, Luay.

Genome Biol ; 21(1): 208, 2020 Aug 17.

Article in English | MEDLINE | ID: mdl-32807205

ABSTRACT

Copy number aberrations (CNAs), which are pathogenic copy number variations (CNVs), play an important role in the initiation and progression of cancer. Single-cell DNA-sequencing (scDNAseq) technologies produce data that are ideal for inferring CNAs. In this review, we survey eight methods that have been developed for detecting CNAs in scDNAseq data, and categorize them according to the steps of a seven-step pipeline that they employ. Furthermore, we review models and methods for evolutionary analyses of CNAs from scDNAseq data and highlight advances and future research directions for computational methods for CNA detection from scDNAseq data.

Subject(s)

Base Sequence; Computational Biology/methods; DNA Copy Number Variations; Sequence Analysis, DNA/methods; Chromosome Aberrations; DNA; High-Throughput Nucleotide Sequencing; Humans; Neoplasms/genetics

Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data.

Mallory, Xian F; Edrisi, Mohammadamin; Navin, Nicholas; Nakhleh, Luay.

PLoS Comput Biol ; 16(7): e1008012, 2020 Jul.

Article in English | MEDLINE | ID: mdl-32658894

ABSTRACT

Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations that are unique to each of these methods is very important for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, with the large datasets being generated, the methods must be computationally efficient.

Subject(s)

DNA Copy Number Variations; Genome, Human; Sequence Analysis, DNA/methods; Single-Cell Analysis/methods; Algorithms; Chromosome Aberrations; Computational Biology; Computer Simulation; Gene Dosage; Humans; Mutation; Neoplasms/genetics; Ploidies; Poisson Distribution; ROC Curve; Reproducibility of Results; Software
