Search | Cancer Prevention and Control

1.

Joint inference of cell lineage and mitochondrial evolution from single-cell sequencing data.

Sashittal, Palash; Chen, Viola; Pasarkar, Amey; Raphael, Benjamin J.

Bioinformatics ; 40(Supplement_1): i218-i227, 2024 Jun 28.

Article in English | MEDLINE | ID: mdl-38940122

ABSTRACT

MOTIVATION: Eukaryotic cells contain organelles called mitochondria that have their own genome. Most cells contain thousands of mitochondria which replicate, even in nondividing cells, by means of a relatively error-prone process resulting in somatic mutations in their genome. Because of the higher mutation rate compared to the nuclear genome, mitochondrial mutations have been used to track cellular lineage, particularly using single-cell sequencing that measures mitochondrial mutations in individual cells. However, existing methods to infer the cell lineage tree from mitochondrial mutations do not model "heteroplasmy," which is the presence of multiple mitochondrial clones with distinct sets of mutations in an individual cell. Single-cell sequencing data thus provide a mixture of the mitochondrial clones in individual cells, with the ancestral relationships between these clones described by a mitochondrial clone tree. While deconvolution of somatic mutations from a mixture of evolutionarily related genomes has been extensively studied in the context of bulk sequencing of cancer tumor samples, the problem of mitochondrial deconvolution has the additional constraint that the mitochondrial clone tree must be concordant with the cell lineage tree. RESULTS: We formalize the problem of inferring a concordant pair of a mitochondrial clone tree and a cell lineage tree from single-cell sequencing data as the Nested Perfect Phylogeny Mixture (NPPM) problem. We derive a combinatorial characterization of the solutions to the NPPM problem, and formulate an algorithm, MERLIN, to solve this problem exactly using a mixed integer linear program. We show on simulated data that MERLIN outperforms existing methods that model neither mitochondrial heteroplasmy nor the concordance between the mitochondrial clone tree and the cell lineage tree. We use MERLIN to analyze single-cell whole-genome sequencing data of 5220 cells of a gastric cancer cell line and show that MERLIN infers a more biologically plausible cell lineage tree and mitochondrial clone tree compared to existing methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/raphael-group/MERLIN.

Subjects

Cell Lineage; Mitochondria; Single-Cell Analysis; Single-Cell Analysis/methods; Humans; Cell Lineage/genetics; Mitochondria/genetics; Mutation; Genome, Mitochondrial; Algorithms; Evolution, Molecular
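
The "perfect phylogeny" in the NPPM problem name of entry 1 can be illustrated with the classical pairwise-compatibility test for binary mutation matrices. The sketch below covers only that textbook condition, assuming a mutation-free root and single-gain mutations; it is not the NPPM formulation or the MERLIN algorithm, and the function name and example matrices are hypothetical.

```python
from itertools import combinations


def admits_perfect_phylogeny(matrix):
    """Return True if no pair of mutation columns conflicts.

    matrix[i][j] is 1 if clone i carries mutation j, else 0.
    Two columns conflict when the patterns (1, 1), (1, 0) and (0, 1)
    all occur, which no rooted tree with single-gain mutations can explain.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    for a, b in combinations(range(n_cols), 2):
        patterns = {(row[a], row[b]) for row in matrix}
        if {(1, 1), (1, 0), (0, 1)} <= patterns:
            return False
    return True


# Hypothetical clone-by-mutation matrices, for illustration only.
compatible = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
conflicting = [[1, 1], [1, 0], [0, 1]]
```

In the first matrix every pair of mutations is nested or disjoint, so a tree exists; in the second, the two mutations overlap without nesting.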

2.

HATCHet2: clone- and haplotype-specific copy number inference from bulk tumor sequencing data.

Myers, Matthew A; Arnold, Brian J; Bansal, Vineet; Balaban, Metin; Mullen, Katelyn M; Zaccaria, Simone; Raphael, Benjamin J.

Genome Biol ; 25(1): 130, 2024 May 21.

Article in English | MEDLINE | ID: mdl-38773520

ABSTRACT

Bulk DNA sequencing of multiple samples from the same tumor is becoming common, yet most methods to infer copy-number aberrations (CNAs) from this data analyze individual samples independently. We introduce HATCHet2, an algorithm to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 extends the earlier HATCHet method by improving identification of focal CNAs and introducing a novel statistic, the minor haplotype B-allele frequency (mhBAF), that enables identification of mirrored-subclonal CNAs. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 10 prostate cancer patients reveals previously unreported mirrored-subclonal CNAs affecting cancer genes.

Subjects

Algorithms; DNA Copy Number Variations; Haplotypes; Prostatic Neoplasms; Humans; Prostatic Neoplasms/genetics; Male; Sequence Analysis, DNA/methods; Neoplasms/genetics; Gene Frequency; Single-Cell Analysis

3.

A latent variable model for evaluating mutual exclusivity and co-occurrence between driver mutations in cancer.

Shuaibi, Ahmed; Chitra, Uthsav; Raphael, Benjamin J.

bioRxiv ; 2024 Apr 27.

Article in English | MEDLINE | ID: mdl-38712136

ABSTRACT

A key challenge in cancer genomics is understanding the functional relationships and dependencies between combinations of somatic mutations that drive cancer development. Such driver mutations frequently exhibit patterns of mutual exclusivity or co-occurrence across tumors, and many methods have been developed to identify such dependency patterns from bulk DNA sequencing data of a cohort of patients. However, while mutual exclusivity and co-occurrence are described as properties of driver mutations, existing methods do not explicitly disentangle functional, driver mutations from neutral, passenger mutations. In particular, nearly all existing methods evaluate mutual exclusivity or co-occurrence at the gene level, marking a gene as mutated if any mutation - driver or passenger - is present. Since some genes have a large number of passenger mutations, existing methods either restrict their analyses to a small subset of suspected driver genes - limiting their ability to identify novel dependencies - or make spurious inferences of mutual exclusivity and co-occurrence involving genes with many passenger mutations. We introduce DIALECT, an algorithm to identify dependencies between pairs of driver mutations from somatic mutation counts. We derive a latent variable mixture model for drivers and passengers that combines existing probabilistic models of passenger mutation rates with a latent variable describing the unknown status of a mutation as a driver or passenger. We use an expectation maximization (EM) algorithm to estimate the parameters of our model, including the rates of mutual exclusivity and co-occurrence between drivers. We demonstrate that DIALECT more accurately infers mutual exclusivity and co-occurrence between driver mutations compared to existing methods on both simulated mutation data and somatic mutation data from 5 cancer types in The Cancer Genome Atlas (TCGA).

4.

Maximum Likelihood Inference of Time-scaled Cell Lineage Trees with Mixed-type Missing Data.

Mai, Uyen; Chu, Gillian; Raphael, Benjamin J.

bioRxiv ; 2024 Mar 23.

Article in English | MEDLINE | ID: mdl-38496496

ABSTRACT

Recent dynamic lineage tracing technologies combine CRISPR-based genome editing with single-cell sequencing to track cell divisions during development. A key computational problem in dynamic lineage tracing is to infer a cell lineage tree from the measured CRISPR-induced mutations. Three features of dynamic lineage tracing data distinguish this problem from standard phylogenetic tree inference. First, the CRISPR-editing process modifies a genomic location exactly once. This non-modifiable property is not well described by the time-reversible models commonly used in phylogenetics. Second, as a consequence of non-modifiability, the number of mutations per time unit decreases over time. Third, CRISPR-based genome-editing and single-cell sequencing result in high rates of both heritable and non-heritable (dropout) missing data. To model these features, we introduce the Probabilistic Mixed-type Missing (PMM) model. We describe an algorithm, LAML (Lineage Analysis via Maximum Likelihood), to search for the maximum likelihood (ML) tree under the PMM model. LAML combines an Expectation Maximization (EM) algorithm with a heuristic tree search to jointly estimate tree topology, branch lengths and missing data parameters. We derive a closed-form solution for the M-step in the case of no heritable missing data, and a block coordinate ascent approach in the general case which is more efficient than the standard General Time Reversible (GTR) phylogenetic model. On simulated data, LAML infers more accurate tree topologies and branch lengths than existing methods, with greater advantages on datasets with higher ratios of heritable to non-heritable missing data. We show that LAML provides unbiased time-scaled estimates of branch lengths. In contrast, we demonstrate that maximum parsimony methods for lineage tracing data not only underestimate branch lengths, but also yield branch lengths which are not proportional to time, due to the nonlinear decay in the number of mutations on branches further from the root. On lineage tracing data from a mouse model of lung adenocarcinoma, we show that LAML infers phylogenetic distances that are more concordant with gene expression data compared to distances derived from maximum parsimony. The LAML tree topology is more plausible than existing published trees, with fewer total cell migrations between distant metastases and fewer reseeding events where cells migrate back to the primary tumor. Crucially, we identify three distinct time epochs of metastasis progression, including a burst of metastasis events to various anatomical sites during a single month.
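
The second feature named in entry 4, a mutation count per time unit that falls over time because each site can be edited only once, can be made concrete with a small calculation. The sketch below assumes independent sites edited at a constant rate, so that the unedited fraction decays exponentially. That model, the function name and the parameter values are illustrative assumptions, not the PMM model or the LAML implementation.

```python
import math


def expected_new_edits(num_sites, rate, t_start, t_end):
    """Expected number of sites first edited in the interval (t_start, t_end].

    Assumes each site is edited independently at constant rate `rate`
    and can never be edited again afterwards (non-modifiability).
    """
    unedited_at_start = num_sites * math.exp(-rate * t_start)
    unedited_at_end = num_sites * math.exp(-rate * t_end)
    return unedited_at_start - unedited_at_end


# Hypothetical setting: 30 target sites, rate 0.5 per time unit, 5 unit intervals.
per_interval = [expected_new_edits(30, 0.5, t, t + 1) for t in range(5)]
```

Each successive interval yields fewer new edits than the one before it, which is why edit counts on deep branches are not proportional to elapsed time.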

5.

ConDoR: tumor phylogeny inference with a copy-number constrained mutation loss model.

Sashittal, Palash; Zhang, Haochen; Iacobuzio-Donahue, Christine A; Raphael, Benjamin J.

Genome Biol ; 24(1): 272, 2023 Nov 30.

Article in English | MEDLINE | ID: mdl-38037115

ABSTRACT

A tumor contains a diverse collection of somatic mutations that reflect its past evolutionary history and that range in scale from single nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). However, no current single-cell DNA sequencing (scDNA-seq) technology produces accurate measurements of both SNVs and CNAs, complicating the inference of tumor phylogenies. We introduce a new evolutionary model, the constrained k-Dollo model, that uses SNVs as phylogenetic markers but constrains losses of SNVs according to clusters of cells. We derive an algorithm, ConDoR, that infers phylogenies from targeted scDNA-seq data using this model. We demonstrate the advantages of ConDoR on simulated and real scDNA-seq data.

Subjects

Neoplasms; Humans; Animals; Phylogeny; Neoplasms/genetics; Mutation; Algorithms; Sequence Analysis, DNA; Birds/genetics; DNA Copy Number Variations

6.

Startle: A star homoplasy approach for CRISPR-Cas9 lineage tracing.

Sashittal, Palash; Schmidt, Henri; Chan, Michelle; Raphael, Benjamin J.

Cell Syst ; 14(12): 1113-1121.e9, 2023 Dec 20.

Article in English | MEDLINE | ID: mdl-38128483

ABSTRACT

CRISPR-Cas9-based genome editing combined with single-cell sequencing enables the tracing of the history of cell divisions, or cellular lineage, in tissues and whole organisms. Although standard phylogenetic approaches may be applied to reconstruct cellular lineage trees from this data, the unique features of the CRISPR-Cas9 editing process motivate the development of specialized models that describe the evolution of CRISPR-Cas9-induced mutations. Here, we introduce the "star homoplasy" evolutionary model that constrains a phylogenetic character to mutate at most once along a lineage, capturing the "non-modifiability" property of CRISPR-Cas9 mutations. We derive a combinatorial characterization of star homoplasy phylogenies and use this characterization to develop an algorithm, "Startle", that computes a maximum parsimony star homoplasy phylogeny. We demonstrate that Startle infers more accurate phylogenies on simulated lineage tracing data compared with existing methods and finds parsimonious phylogenies with fewer metastatic migrations on lineage tracing data from mouse metastatic lung adenocarcinoma.

Subjects

CRISPR-Cas Systems; Gene Editing; Animals; Mice; CRISPR-Cas Systems/genetics; Phylogeny; Gene Editing/methods; Cell Lineage/genetics; Mutation
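
The star homoplasy constraint in entry 6, that a character mutates at most once along a lineage, can be checked on a fully labeled tree. The sketch below assumes state 0 means "unedited" and that every node, internal nodes included, has a known state; the data layout and function name are hypothetical, and this is a consistency check rather than the Startle algorithm.

```python
def violates_star_homoplasy(parent, states):
    """Return the edges on which an already-mutated character changes again.

    parent maps each non-root node to its parent; states maps each node to a
    tuple of character states, with 0 taken to mean "unedited" (an assumption).
    """
    bad_edges = []
    for child, par in parent.items():
        for p_state, c_state in zip(states[par], states[child]):
            # Once a character has left state 0 it must be inherited unchanged.
            if p_state != 0 and c_state != p_state:
                bad_edges.append((par, child))
                break
    return bad_edges


# Hypothetical four-node tree with two characters.
parent = {"a": "root", "b": "root", "c": "a"}
ok_states = {"root": (0, 0), "a": (1, 0), "b": (1, 2), "c": (1, 3)}
bad_states = {"root": (0, 0), "a": (1, 0), "b": (1, 2), "c": (2, 3)}
```

In `ok_states` the same state 1 arises independently in `a` and `b`, which the model permits (that is the homoplasy); in `bad_states` the first character changes from 1 to 2 on the edge from `a` to `c`, which it forbids.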

7.

Epigenetic regulation during cancer transitions across 11 tumour types.

Terekhanova, Nadezhda V; Karpova, Alla; Liang, Wen-Wei; Strzalkowski, Alexander; Chen, Siqi; Li, Yize; Southard-Smith, Austin N; Iglesia, Michael D; Wendl, Michael C; Jayasinghe, Reyka G; Liu, Jingxian; Song, Yizhe; Cao, Song; Houston, Andrew; Liu, Xiuting; Wyczalkowski, Matthew A; Lu, Rita Jui-Hsien; Caravan, Wagma; Shinkle, Andrew; Naser Al Deen, Nataly; Herndon, John M; Mudd, Jacqueline; Ma, Cong; Sarkar, Hirak; Sato, Kazuhito; Ibrahim, Omar M; Mo, Chia-Kuei; Chasnoff, Sara E; Porta-Pardo, Eduard; Held, Jason M; Pachynski, Russell; Schwarz, Julie K; Gillanders, William E; Kim, Albert H; Vij, Ravi; DiPersio, John F; Puram, Sidharth V; Chheda, Milan G; Fuh, Katherine C; DeNardo, David G; Fields, Ryan C; Chen, Feng; Raphael, Benjamin J; Ding, Li.

Nature ; 623(7986): 432-441, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37914932

ABSTRACT

Chromatin accessibility is essential in regulating gene expression and cellular identity, and alterations in accessibility have been implicated in driving cancer initiation, progression and metastasis [1-4]. Although the genetic contributions to oncogenic transitions have been investigated, epigenetic drivers remain less understood. Here we constructed a pan-cancer epigenetic and transcriptomic atlas using single-nucleus chromatin accessibility data (using single-nucleus assay for transposase-accessible chromatin) from 225 samples and matched single-cell or single-nucleus RNA-sequencing expression data from 206 samples. With over 1 million cells from each platform analysed through the enrichment of accessible chromatin regions, transcription factor motifs and regulons, we identified epigenetic drivers associated with cancer transitions. Some epigenetic drivers appeared in multiple cancers (for example, regulatory regions of ABCC1 and VEGFA; GATA6 and FOX-family motifs), whereas others were cancer specific (for example, regulatory regions of FGF19, ASAP2 and EN1, and the PBX3 motif). Among epigenetically altered pathways, TP53, hypoxia and TNF signalling were linked to cancer initiation, whereas oestrogen response, epithelial-mesenchymal transition and apical junction were tied to metastatic transition. Furthermore, we revealed a marked correlation between enhancer accessibility and gene expression and uncovered cooperation between epigenetic and genetic drivers. This atlas provides a foundation for further investigation of epigenetic dynamics in cancer transitions.

Subjects

Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Neoplasms; Humans; Cell Hypoxia; Cell Nucleus; Chromatin/genetics; Chromatin/metabolism; Enhancer Elements, Genetic/genetics; Epigenesis, Genetic/genetics; Epithelial-Mesenchymal Transition; Estrogens/metabolism; Gene Expression Profiling; GTPase-Activating Proteins/metabolism; Neoplasm Metastasis; Neoplasms/classification; Neoplasms/genetics; Neoplasms/pathology; Regulatory Sequences, Nucleic Acid/genetics; Single-Cell Analysis; Transcription Factors/metabolism

8.

A zero-agnostic model for copy number evolution in cancer.

Schmidt, Henri; Sashittal, Palash; Raphael, Benjamin J.

PLoS Comput Biol ; 19(11): e1011590, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37943952

ABSTRACT

MOTIVATION: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. Copy number aberrations alter multiple adjacent genomic loci, violating the standard phylogenetic assumption that loci evolve independently. Thus, specialized models to infer copy number phylogenies have been introduced. A widely used model is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. RESULTS: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.

Subjects

DNA Copy Number Variations; Neoplasms; Humans; Phylogeny; DNA Copy Number Variations/genetics; Neoplasms/genetics; Genomics/methods; Genome; Algorithms
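
The event definition in entry 8 (each event adds or subtracts one copy on a contiguous segment, and under the zero-agnostic relaxation zero-copy bins may also be altered) is enough to see why a closed-form distance is plausible. Each event changes exactly two "breakpoint" differences between adjacent bins by one, so the minimum number of events is half the total absolute breakpoint change. The sketch below works this out for a single chromosome. It is an illustration derived from the abstract's description, not the paper's derivation or the Lazac code, and the function name and padding convention are assumptions.

```python
def zero_agnostic_distance(u, v):
    """Minimum number of +1/-1 segment events turning profile u into v,
    when events may also act on zero-copy bins (single chromosome assumed)."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of bins")
    # Per-bin change needed, padded with 0 at both chromosome ends.
    diff = [0] + [b - a for a, b in zip(u, v)] + [0]
    # Change between adjacent bins; every event alters two of these by 1.
    breakpoints = [diff[i] - diff[i - 1] for i in range(1, len(diff))]
    return sum(abs(d) for d in breakpoints) // 2


# Hypothetical 4-bin profiles: one gain on bins 1-2 and one loss on bin 3.
events_needed = zero_agnostic_distance([2, 2, 2, 2], [2, 3, 3, 1])
```

Because the expression depends only on the absolute breakpoint changes, swapping the two profiles gives the same value, which is consistent with the abstract's statement that the ZCNT distance forms a metric.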

9.

Mapping the topography of spatial gene expression with interpretable deep learning.

Chitra, Uthsav; Arnold, Brian J; Sarkar, Hirak; Ma, Cong; Lopez-Darwin, Sereno; Sanno, Kohei; Raphael, Benjamin J.

bioRxiv ; 2023 Oct 13.

Article in English | MEDLINE | ID: mdl-37873258

ABSTRACT

Spatially resolved transcriptomics (SRT) technologies provide high-throughput measurements of gene expression in a tissue slice, but the sparsity of this data complicates the analysis of spatial gene expression patterns such as gene expression gradients. We address these issues by deriving a topographic map of a tissue slice, analogous to a map of elevation in a landscape, using a novel quantity called the isodepth. Contours of constant isodepth enclose spatial domains with distinct cell type composition, while gradients of the isodepth indicate spatial directions of maximum change in gene expression. We develop GASTON, an unsupervised and interpretable deep learning algorithm that simultaneously learns the isodepth, spatial gene expression gradients, and piecewise linear functions of the isodepth that model both continuous gradients and discontinuous spatial variation in the expression of individual genes. We validate GASTON by showing that it accurately identifies spatial domains and marker genes across several biological systems. In SRT data from the brain, GASTON reveals gradients of neuronal differentiation and firing, and in SRT data from a tumor sample, GASTON infers gradients of metabolic activity and epithelial-mesenchymal transition (EMT)-related gene expression in the tumor microenvironment.

10.

Co-evolution of AR gene copy number and structural complexity in endocrine therapy resistant prostate cancer.

Zivanovic, Andrej; Miller, Jeffrey T; Munro, Sarah A; Knutson, Todd P; Li, Yingming; Passow, Courtney N; Simonaitis, Pijus; Lynch, Molly; Oseth, LeAnn; Zhao, Shuang G; Feng, Felix Y; Wikström, Pernilla; Corey, Eva; Morrissey, Colm; Henzler, Christine; Raphael, Benjamin J; Dehm, Scott M.

NAR Cancer ; 5(3): zcad045, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37636316

ABSTRACT

Androgen receptor (AR) inhibition is standard of care for advanced prostate cancer (PC). However, efficacy is limited by progression to castration-resistant PC (CRPC), usually due to AR re-activation via mechanisms that include AR amplification and structural rearrangement. These two classes of AR alterations often co-occur in CRPC tumors, but it is unclear whether this reflects intercellular or intracellular heterogeneity of AR. Resolving this is important for developing new therapies and predictive biomarkers. Here, we analyzed 41 CRPC tumors and 6 patient-derived xenografts (PDXs) using linked-read DNA-sequencing, and identified 7 tumors that developed complex, multiply-rearranged AR gene structures in conjunction with very high AR copy number. Analysis of PDX models by optical genome mapping and fluorescence in situ hybridization showed that AR residing on extrachromosomal DNA (ecDNA) was an underlying mechanism, and was associated with elevated levels and diversity of AR expression. This study identifies co-evolution of AR gene copy number and structural complexity via ecDNA as a mechanism associated with endocrine therapy resistance.

11.

HATCHet2: clone- and haplotype-specific copy number inference from bulk tumor sequencing data.

Myers, Matthew A; Arnold, Brian J; Bansal, Vineet; Mullen, Katelyn M; Zaccaria, Simone; Raphael, Benjamin J.

bioRxiv ; 2023 Jul 15.

Article in English | MEDLINE | ID: mdl-37502835

ABSTRACT

Multi-region DNA sequencing of primary tumors and metastases from individual patients helps identify somatic aberrations driving cancer development. However, most methods to infer copy-number aberrations (CNAs) analyze individual samples. We introduce HATCHet2 to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 introduces a novel statistic, the mirrored haplotype B-allele frequency (mhBAF), to identify mirrored-subclonal CNAs having different numbers of copies of parental haplotypes in different tumor clones. HATCHet2 also has high accuracy in identifying focal CNAs and extends the earlier HATCHet method in several directions. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 50 prostate cancer samples from 10 patients reveals previously-unreported mirrored-subclonal CNAs affecting cancer genes.

12.

Structurally Complex Osteosarcoma Genomes Exhibit Limited Heterogeneity within Individual Tumors and across Evolutionary Time.

Rajan, Sanjana; Zaccaria, Simone; Cannon, Matthew V; Cam, Maren; Gross, Amy C; Raphael, Benjamin J; Roberts, Ryan D.

Cancer Res Commun ; 3(4): 564-575, 2023 Apr.

Article in English | MEDLINE | ID: mdl-37066022

ABSTRACT

Osteosarcoma is an aggressive malignancy characterized by high genomic complexity. Identification of few recurrent mutations in protein coding genes suggests that somatic copy-number aberrations (SCNA) are the genetic drivers of disease. Models of genomic instability conflict: it is unclear whether osteosarcomas result from pervasive ongoing clonal evolution with continuous optimization of the fitness landscape or an early catastrophic event followed by stable maintenance of an abnormal genome. We address this question by investigating SCNAs in >12,000 tumor cells obtained from human osteosarcomas using single-cell DNA sequencing, with a degree of precision and accuracy not possible when inferring single-cell states using bulk sequencing. Using the CHISEL algorithm, we inferred allele- and haplotype-specific SCNAs from this whole-genome single-cell DNA sequencing data. Surprisingly, despite extensive structural complexity, these tumors exhibit a high degree of cell-cell homogeneity with little subclonal diversification. Longitudinal analysis of patient samples obtained at distant therapeutic timepoints (diagnosis, relapse) demonstrated remarkable conservation of SCNA profiles over tumor evolution. Phylogenetic analysis suggests that the majority of SCNAs were acquired early in the oncogenic process, with relatively few structure-altering events arising in response to therapy or during adaptation to growth in metastatic tissues. These data further support the emerging hypothesis that early catastrophic events, rather than sustained genomic instability, give rise to structural complexity, which is then preserved over long periods of tumor developmental time. Significance: Chromosomally complex tumors are often described as genomically unstable. However, determining whether complexity arises from remote time-limited events that give rise to structural alterations or a progressive accumulation of structural events in persistently unstable tumors has implications for diagnosis, biomarker assessment, and mechanisms of treatment resistance, and represents a conceptual advance in our understanding of intratumoral heterogeneity and tumor evolution.

Subjects

Bone Neoplasms; Osteosarcoma; Humans; Phylogeny; DNA Copy Number Variations/genetics; Neoplasm Recurrence, Local; Osteosarcoma/genetics; Genomic Instability/genetics; Bone Neoplasms/genetics

13.

A zero-agnostic model for copy number evolution in cancer.

Schmidt, Henri; Sashittal, Palash; Raphael, Benjamin J.

bioRxiv ; 2023 Apr 12.

Article in English | MEDLINE | ID: mdl-37090633

ABSTRACT

Motivation: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. A widely used model to infer such copy number phylogenies is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. Results: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.

14.

ConDoR: Tumor phylogeny inference with a copy-number constrained mutation loss model.

Sashittal, Palash; Zhang, Haochen; Iacobuzio-Donahue, Christine A; Raphael, Benjamin J.

bioRxiv ; 2023 Jan 06.

Article in English | MEDLINE | ID: mdl-36711528

ABSTRACT

Tumors consist of subpopulations of cells that harbor distinct collections of somatic mutations. These mutations range in scale from single nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). While many approaches infer tumor phylogenies using SNVs as phylogenetic markers, CNAs that overlap SNVs may lead to erroneous phylogenetic inference. Specifically, an SNV may be lost in a cell due to a deletion of the genomic segment containing the SNV. Unfortunately, no current single-cell DNA sequencing (scDNA-seq) technology produces accurate measurements of both SNVs and CNAs. For instance, recent targeted scDNA-seq technologies, such as Mission Bio Tapestri, measure SNVs with high fidelity in individual cells, but yield much less reliable measurements of CNAs. We introduce a new evolutionary model, the constrained k-Dollo model, that uses SNVs as phylogenetic markers and partial information about CNAs in the form of clustering of cells with similar copy-number profiles. This copy-number clustering constrains where loss of SNVs can occur in the phylogeny. We develop ConDoR (Constrained Dollo Reconstruction), an algorithm to infer tumor phylogenies from targeted scDNA-seq data using the constrained k-Dollo model. We show that ConDoR outperforms existing methods on simulated data. We use ConDoR to analyze a new multi-region targeted scDNA-seq dataset of 2153 cells from a pancreatic ductal adenocarcinoma (PDAC) tumor and produce a more plausible phylogeny compared to existing methods that conforms to histological results for the tumor from a previous study. We also analyze a metastatic colorectal cancer dataset, deriving a more parsimonious phylogeny than previously published analyses and with a simpler monoclonal origin of metastasis compared to the original study. Code availability: Software is available at https://github.com/raphael-group/constrained-Dollo.

15.

Author Correction: Pathway and network analysis of more than 2500 whole cancer genomes.

Reyna, Matthew A; Haan, David; Paczkowska, Marta; Verbeke, Lieven P C; Vazquez, Miguel; Kahraman, Abdullah; Pulido-Tamayo, Sergio; Barenboim, Jonathan; Wadi, Lina; Dhingra, Priyanka; Shrestha, Raunak; Getz, Gad; Lawrence, Michael S; Pedersen, Jakob Skou; Rubin, Mark A; Wheeler, David A; Brunak, Søren; Izarzugaza, Jose M G; Khurana, Ekta; Marchal, Kathleen; von Mering, Christian; Sahinalp, S Cenk; Valencia, Alfonso; Reimand, Jüri; Stuart, Joshua M; Raphael, Benjamin J.

Nat Commun ; 13(1): 7566, 2022 Dec 08.

Article in English | MEDLINE | ID: mdl-36481610

16.

NetMix2: A Principled Network Propagation Algorithm for Identifying Altered Subnetworks.

Chitra, Uthsav; Park, Tae Yoon; Raphael, Benjamin J.

J Comput Biol ; 29(12): 1305-1323, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36525308

ABSTRACT

A standard paradigm in computational biology is to leverage interaction networks as prior knowledge in analyzing high-throughput biological data, where the data give a score for each vertex in the network. One classical approach is the identification of altered subnetworks, or subnetworks of the interaction network that have both outlier vertex scores and a defined network topology. One class of algorithms for identifying altered subnetworks searches for high-scoring subnetworks in subnetwork families with simple topological constraints, such as connected subnetworks, and has sound statistical guarantees. A second class of algorithms employs network propagation (the smoothing of vertex scores over the network using a random walk or diffusion process) and utilizes the global structure of the network. However, network propagation algorithms often rely on ad hoc heuristics that lack a rigorous statistical foundation. In this work, we unify the subnetwork family and network propagation approaches by deriving the propagation family, a subnetwork family that approximates the sets of vertices ranked highly by network propagation approaches. We introduce NetMix2, a principled algorithm for identifying altered subnetworks from a wide range of subnetwork families. When using the propagation family, NetMix2 combines the advantages of the subnetwork family and network propagation approaches. NetMix2 outperforms other methods, including network propagation, on simulated data, pan-cancer somatic mutation data, and genome-wide association data from multiple human diseases.

Subjects

Genome-Wide Association Study; Neoplasms; Humans; Computational Biology/methods; Algorithms
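
Network propagation, described in entry 16 as the smoothing of vertex scores over the network using a random walk or diffusion process, can be sketched with a generic random walk with restart. This is a common textbook form of propagation and not NetMix2 or its propagation family; the function name, the restart probability and the toy graph are assumptions chosen for illustration.

```python
def propagate_scores(adjacency, scores, restart=0.5, iterations=100):
    """Smooth vertex scores with a random walk with restart.

    adjacency[i] lists the neighbours of vertex i (undirected, no isolated
    vertices); scores[i] is the initial score of vertex i.
    """
    n = len(scores)
    current = list(scores)
    for _ in range(iterations):
        spread = [0.0] * n
        for i, neighbours in enumerate(adjacency):
            # Each vertex splits its current score evenly among its neighbours.
            share = current[i] / len(neighbours)
            for j in neighbours:
                spread[j] += share
        # Mix the diffused scores with the original scores (the restart step).
        current = [restart * scores[i] + (1 - restart) * spread[i] for i in range(n)]
    return current


# Hypothetical path graph 0 - 1 - 2 - 3 with all initial score on vertex 0.
path = [[1], [0, 2], [1, 3], [2]]
smoothed = propagate_scores(path, [1.0, 0.0, 0.0, 0.0])
```

The smoothed scores decrease with distance from the scored vertex while the total score is preserved, so a high-scoring vertex raises the ranking of its network neighbourhood.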

17.

A path to translation: How 3D patient tumor avatars enable next generation precision oncology.

Bose, Shree; Barroso, Margarida; Chheda, Milan G; Clevers, Hans; Elez, Elena; Kaochar, Salma; Kopetz, Scott E; Li, Xiao-Nan; Meric-Bernstam, Funda; Meyer, Clifford A; Mou, Haiwei; Naegle, Kristen M; Pera, Martin F; Perova, Zinaida; Politi, Katerina A; Raphael, Benjamin J; Robson, Paul; Sears, Rosalie C; Tabernero, Josep; Tuveson, David A; Welm, Alana L; Welm, Bryan E; Willey, Christopher D; Salnikow, Konstantin; Chuang, Jeffrey H; Shen, Xiling.

Cancer Cell ; 40(12): 1448-1453, 2022 Dec 12.

Article in English | MEDLINE | ID: mdl-36270276

ABSTRACT

3D patient tumor avatars (3D-PTAs) hold promise for next-generation precision medicine. Here, we describe the benefits and challenges of 3D-PTA technologies and necessary future steps to realize their potential for clinical decision making. 3D-PTAs require standardization criteria and prospective trials to establish clinical benefits. Innovative trial designs that combine omics and 3D-PTA readouts may lead to more accurate clinical predictors, and an integrated platform that combines diagnostic and therapeutic development will accelerate new treatments for patients with refractory disease.

Subjects

Neoplasms , Humans , Neoplasms/genetics , Neoplasms/therapy , Neoplasms/diagnosis , Precision Medicine , Prospective Studies , Medical Oncology

18.

Convergent evolution and multi-wave clonal invasion in H3 K27-altered diffuse midline gliomas treated with a PDGFR inhibitor.

Arunachalam, Sasi; Szlachta, Karol; Brady, Samuel W; Ma, Xiaotu; Ju, Bensheng; Shaner, Bridget; Mulder, Heather L; Easton, John; Raphael, Benjamin J; Myers, Matthew; Tinkle, Christopher; Allen, Sariah J; Orr, Brent A; Wetmore, Cynthia J; Baker, Suzanne J; Zhang, Jinghui.

Acta Neuropathol Commun ; 10(1): 80, 2022 May 31.

Article in English | MEDLINE | ID: mdl-35642016

ABSTRACT

The majority of diffuse midline gliomas, H3 K27-altered (DMG-H3 K27-a), are infiltrating pediatric brain tumors that arise in the pons with no effective treatment. To understand how clonal evolution contributes to the tumor's invasive spread, we performed exome sequencing and SNP array profiling on 49 multi-region autopsy samples from 11 patients with pontine DMG-H3 K27-a enrolled in a phase I clinical trial of the PDGFR inhibitor crenolanib. For each patient, a phylogenetic tree was constructed by testing multiple possible clonal evolution models to select the one consistent with somatic mutations and copy number variations across all tumor regions. The tree was then used to deconvolute subclonal composition and prevalence at each tumor region to study convergent evolution and invasion patterns. Somatic variants in the PI3K pathway, a late event, are enriched in our cohort, affecting 70% of patients. Convergent evolution of PI3K at distinct phylogenetic branches was detected in 40% of the patients. Twenty-four (~50%) of the tumor regions were occupied by subclones of mixed lineages with varying molecular ages, indicating multiple waves of invasion across the pons and extrapontine regions. Subclones harboring a PDGFRA amplicon, including one that amplified a PDGFRA Y849C mutant allele, were detected in four patients; their presence in extrapontine tumor and normal brain samples implies their involvement in extrapontine invasion. Our study expands the current knowledge on tumor invasion patterns in DMG-H3 K27-a, which may inform the design of future clinical trials.

Subjects

DNA Copy Number Variations , Glioma , Child , Glioma/drug therapy , Glioma/genetics , Glioma/pathology , Histones/genetics , Humans , Mutation/genetics , Phosphatidylinositol 3-Kinases/genetics , Phylogeny , Protein Kinase Inhibitors

19.

SuperDendrix algorithm integrates genetic dependencies and genomic alterations across pathways and cancer types.

Park, Tae Yoon; Leiserson, Mark D M; Klau, Gunnar W; Raphael, Benjamin J.

Cell Genom ; 2(2), 2022 Feb 09.

Article in English | MEDLINE | ID: mdl-35382456

ABSTRACT

Recent genome-wide CRISPR-Cas9 loss-of-function screens have identified genetic dependencies across many cancer cell lines. Associations between these dependencies and genomic alterations in the same cell lines reveal phenomena such as oncogene addiction and synthetic lethality. However, comprehensive identification of such associations is complicated by complex interactions between genes across genetically heterogeneous cancer types. We introduce and apply the algorithm SuperDendrix to CRISPR-Cas9 loss-of-function screens from 769 cancer cell lines, to identify differential dependencies across cell lines and to find associations between differential dependencies and combinations of genomic alterations and cell-type-specific markers. These associations respect the position and type of interactions within pathways: for example, we observe increased dependencies on downstream activators of pathways, such as NFE2L2, and decreased dependencies on upstream activators of pathways, such as CDK6. SuperDendrix also reveals dozens of dependencies on lineage-specific transcription factors, identifies cancer-type-specific correlations between dependencies, and enables annotation of individual mutated residues.

20.

DeCiFering the elusive cancer cell fraction in tumor heterogeneity and evolution.

Satas, Gryte; Zaccaria, Simone; El-Kebir, Mohammed; Raphael, Benjamin J.

Cell Syst ; 12(10): 1004-1018.e10, 2021 Oct 20.

Article in English | MEDLINE | ID: mdl-34416171

ABSTRACT

The cancer cell fraction (CCF), or proportion of cancerous cells in a tumor containing a single-nucleotide variant (SNV), is a fundamental statistic used to quantify tumor heterogeneity and evolution. Existing CCF estimation methods from bulk DNA sequencing data assume that every cell with an SNV contains the same number of copies of the SNV. This assumption is unrealistic in tumors with copy-number aberrations that alter SNV multiplicities. Furthermore, the CCF does not account for SNV losses due to copy-number aberrations, confounding downstream phylogenetic analyses. We introduce DeCiFer, an algorithm that overcomes these limitations by clustering SNVs using a novel statistic, the descendant cell fraction (DCF). The DCF quantifies both the prevalence of an SNV at the present time and its past evolutionary history using an evolutionary model that allows mutation losses. We show that DeCiFer yields more parsimonious reconstructions of tumor evolution than previously reported for 49 prostate cancer samples.

Subjects

Neoplasms , Polymorphism, Single Nucleotide , Algorithms , Humans , Male , Neoplasms/genetics , Neoplasms/pathology , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
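To make concrete why the "same number of copies" assumption criticized in the DeCiFer abstract above matters, the sketch below applies the commonly used relationship between variant allele frequency (VAF), tumor purity, copy number, and SNV multiplicity to estimate a CCF. This is not the DeCiFer method or its descendant cell fraction; the function name `ccf_from_vaf`, its parameters, and the example numbers are assumptions chosen for illustration.

```python
def ccf_from_vaf(vaf, purity, total_cn, multiplicity=1, normal_cn=2):
    """Estimate the cancer cell fraction of an SNV from bulk data (illustrative only).

    Assumes VAF = multiplicity * CCF * purity / average copies per cell,
    where the average mixes tumor cells (total_cn) and normal cells (normal_cn).
    """
    avg_copies = purity * total_cn + (1 - purity) * normal_cn
    return vaf * avg_copies / (multiplicity * purity)


# Same observed VAF, two different assumed multiplicities (hypothetical numbers).
ccf_one_copy = ccf_from_vaf(vaf=0.25, purity=0.5, total_cn=2, multiplicity=1)
ccf_two_copies = ccf_from_vaf(vaf=0.25, purity=0.5, total_cn=2, multiplicity=2)
```

With these inputs the same VAF corresponds to a CCF of 1.0 if each mutated cell carries one copy of the SNV, but 0.5 if each carries two, so an incorrect multiplicity assumption changes the estimate substantially.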
