Results 1 - 20 of 129
1.
bioRxiv ; 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38712136

ABSTRACT

A key challenge in cancer genomics is understanding the functional relationships and dependencies between combinations of somatic mutations that drive cancer development. Such driver mutations frequently exhibit patterns of mutual exclusivity or co-occurrence across tumors, and many methods have been developed to identify such dependency patterns from bulk DNA sequencing data of a cohort of patients. However, while mutual exclusivity and co-occurrence are described as properties of driver mutations, existing methods do not explicitly disentangle functional driver mutations from neutral passenger mutations. In particular, nearly all existing methods evaluate mutual exclusivity or co-occurrence at the gene level, marking a gene as mutated if any mutation (driver or passenger) is present. Since some genes have a large number of passenger mutations, existing methods either restrict their analyses to a small subset of suspected driver genes, limiting their ability to identify novel dependencies, or make spurious inferences of mutual exclusivity and co-occurrence involving genes with many passenger mutations. We introduce DIALECT, an algorithm to identify dependencies between pairs of driver mutations from somatic mutation counts. We derive a latent variable mixture model for drivers and passengers that combines existing probabilistic models of passenger mutation rates with a latent variable describing the unknown status of a mutation as a driver or passenger. We use an expectation maximization (EM) algorithm to estimate the parameters of our model, including the rates of mutual exclusivity and co-occurrence between drivers. We demonstrate that DIALECT more accurately infers mutual exclusivity and co-occurrence between driver mutations compared to existing methods on both simulated mutation data and somatic mutation data from 5 cancer types in The Cancer Genome Atlas (TCGA).
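The idea of an EM fit over a latent driver/passenger variable can be illustrated with a deliberately small toy model. The sketch below is not DIALECT: it assumes a single gene, a known per-sample passenger rate, and at most one driver mutation per sample, so that each observed count is a Poisson passenger count plus a Bernoulli driver indicator. The function names and the model itself are assumptions made for illustration only.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k events; zero for negative k."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)


def em_driver_rate(counts, passenger_rates, n_iter=200, tol=1e-10):
    """Estimate pi = P(sample carries a driver) in a toy model (assumption):
    count_i = Poisson(passenger_rate_i) + Bernoulli(pi)."""
    pi = 0.5
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each sample carries a driver.
        resp = []
        for c, lam in zip(counts, passenger_rates):
            with_driver = pi * poisson_pmf(c - 1, lam)
            without_driver = (1 - pi) * poisson_pmf(c, lam)
            total = with_driver + without_driver
            resp.append(with_driver / total if total > 0 else 0.0)
        # M-step: the new driver rate is the mean responsibility.
        new_pi = sum(resp) / len(resp)
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi, resp
```

With a very low passenger rate, almost every observed mutation is attributed to a driver, so the estimate approaches the fraction of mutated samples. A pairwise model would add a joint latent state for two genes, which this sketch does not attempt.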

2.
bioRxiv ; 2024 Mar 10.
Article in English | MEDLINE | ID: mdl-38496660

ABSTRACT

Spatially resolved transcriptomics (SRT) measures mRNA transcripts at thousands of locations within a tissue slice, revealing spatial variations in gene expression and distribution of cell types. In recent studies, SRT has been applied to tissue slices from multiple timepoints during the development of an organism. Alignment of this spatiotemporal transcriptomics data can provide insights into the gene expression programs governing the growth and differentiation of cells over space and time. We introduce DeST-OT (Developmental SpatioTemporal Optimal Transport), a method to align SRT slices from pairs of developmental timepoints using the framework of optimal transport (OT). DeST-OT uses semi-relaxed optimal transport to precisely model cellular growth, death, and differentiation processes that are not well-modeled by existing alignment methods. We demonstrate the advantage of DeST-OT on simulated slices. We further introduce two metrics to quantify the plausibility of a spatiotemporal alignment: a growth distortion metric which quantifies the discrepancy between the inferred and the true cell type growth rates, and a migration metric which quantifies the distance traveled between ancestor and descendant cells. DeST-OT outperforms existing methods on these metrics in the alignment of spatiotemporal transcriptomics data from the development of axolotl brain.
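To make the term "semi-relaxed" concrete: in a semi-relaxed transport problem only one marginal of the transport plan is fixed and the other is left free, which is what lets mass concentrate (growth) or thin out (death) on the free side. The sketch below is a generic entropic version with a closed-form solution, not the DeST-OT objective; which marginal DeST-OT relaxes, and its cost terms, are not stated in the abstract, so the choice of fixing the row marginal is an assumption.

```python
import numpy as np


def semi_relaxed_entropic_ot(cost, source_mass, eps=0.1):
    """Minimize <cost, P> - eps * H(P) subject only to P.sum(axis=1) == source_mass.

    With the column marginal free, the optimum is a row-wise softmax of
    -cost / eps, scaled by the mass of each source point.
    """
    cost = np.asarray(cost, dtype=float)
    source_mass = np.asarray(source_mass, dtype=float)
    logits = -cost / eps
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    kernel = np.exp(logits)
    return source_mass[:, None] * kernel / kernel.sum(axis=1, keepdims=True)


# Three source spots, two target spots; two sources are closest to target 0.
plan = semi_relaxed_entropic_ot([[0, 1], [1, 0], [0, 1]], [1 / 3] * 3)
received = plan.sum(axis=0)  # free marginal: mass arriving at each target
```

The free marginal `received` is not uniform here, which is the behavior a fully balanced OT formulation would forbid.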

3.
bioRxiv ; 2024 Mar 23.
Article in English | MEDLINE | ID: mdl-38496496

ABSTRACT

Recent dynamic lineage tracing technologies combine CRISPR-based genome editing with single-cell sequencing to track cell divisions during development. A key computational problem in dynamic lineage tracing is to infer a cell lineage tree from the measured CRISPR-induced mutations. Three features of dynamic lineage tracing data distinguish this problem from standard phylogenetic tree inference. First, the CRISPR-editing process modifies a genomic location exactly once. This non-modifiable property is not well described by the time-reversible models commonly used in phylogenetics. Second, as a consequence of non-modifiability, the number of mutations per time unit decreases over time. Third, CRISPR-based genome-editing and single-cell sequencing result in high rates of both heritable and non-heritable (dropout) missing data. To model these features, we introduce the Probabilistic Mixed-type Missing (PMM) model. We describe an algorithm, LAML (Lineage Analysis via Maximum Likelihood), to search for the maximum likelihood (ML) tree under the PMM model. LAML combines an Expectation Maximization (EM) algorithm with a heuristic tree search to jointly estimate tree topology, branch lengths and missing data parameters. We derive a closed-form solution for the M-step in the case of no heritable missing data, and a block coordinate ascent approach in the general case which is more efficient than the standard General Time Reversible (GTR) phylogenetic model. On simulated data, LAML infers more accurate tree topologies and branch lengths than existing methods, with greater advantages on datasets with higher ratios of heritable to non-heritable missing data. We show that LAML provides unbiased time-scaled estimates of branch lengths.
In contrast, we demonstrate that maximum parsimony methods for lineage tracing data not only underestimate branch lengths, but also yield branch lengths which are not proportional to time, due to the nonlinear decay in the number of mutations on branches further from the root. On lineage tracing data from a mouse model of lung adenocarcinoma, we show that LAML infers phylogenetic distances that are more concordant with gene expression data compared to distances derived from maximum parsimony. The LAML tree topology is more plausible than existing published trees, with fewer total cell migrations between distant metastases and fewer reseeding events where cells migrate back to the primary tumor. Crucially, we identify three distinct time epochs of metastasis progression, which includes a burst of metastasis events to various anatomical sites during a single month.
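The nonlinear decay in mutation counts follows directly from non-modifiability, and a short calculation shows why counting mutations underestimates time on later branches. The sketch assumes each site is edited irreversibly after an exponentially distributed waiting time with a constant rate; that rate model, the function name, and the numbers are illustrative assumptions, not the PMM model itself.

```python
import math


def expected_new_edits(n_sites, rate, t_start, t_end):
    """Expected number of sites first edited during [t_start, t_end].

    Assumes every site is edited at most once, after an Exp(rate) waiting
    time, so a site is still unedited at time t with probability exp(-rate * t).
    """
    return n_sites * (math.exp(-rate * t_start) - math.exp(-rate * t_end))


# Four consecutive unit-length intervals, 30 editable sites, rate 0.5 per unit time.
per_interval = [expected_new_edits(30, 0.5, t, t + 1) for t in range(4)]
```

Each interval has the same duration, yet `per_interval` shrinks geometrically because fewer unedited sites remain. A branch length read off as a raw mutation count is therefore not proportional to elapsed time.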

4.
Genome Biol ; 24(1): 272, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-38037115

ABSTRACT

A tumor contains a diverse collection of somatic mutations that reflect its past evolutionary history and that range in scale from single nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). However, no current single-cell DNA sequencing (scDNA-seq) technology produces accurate measurements of both SNVs and CNAs, complicating the inference of tumor phylogenies. We introduce a new evolutionary model, the constrained k-Dollo model, that uses SNVs as phylogenetic markers but constrains losses of SNVs according to clusters of cells. We derive an algorithm, ConDoR, that infers phylogenies from targeted scDNA-seq data using this model. We demonstrate the advantages of ConDoR on simulated and real scDNA-seq data.


Subject(s)
Neoplasms, Humans, Animals, Phylogeny, Neoplasms/genetics, Mutation, Algorithms, DNA Sequence Analysis, Birds/genetics, DNA Copy Number Variations
5.
Cell Syst ; 14(12): 1113-1121.e9, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38128483

ABSTRACT

CRISPR-Cas9-based genome editing combined with single-cell sequencing enables the tracing of the history of cell divisions, or cellular lineage, in tissues and whole organisms. Although standard phylogenetic approaches may be applied to reconstruct cellular lineage trees from this data, the unique features of the CRISPR-Cas9 editing process motivate the development of specialized models that describe the evolution of CRISPR-Cas9-induced mutations. Here, we introduce the "star homoplasy" evolutionary model that constrains a phylogenetic character to mutate at most once along a lineage, capturing the "non-modifiability" property of CRISPR-Cas9 mutations. We derive a combinatorial characterization of star homoplasy phylogenies and use this characterization to develop an algorithm, "Startle", that computes a maximum parsimony star homoplasy phylogeny. We demonstrate that Startle infers more accurate phylogenies on simulated lineage tracing data compared with existing methods and finds parsimonious phylogenies with fewer metastatic migrations on lineage tracing data from mouse metastatic lung adenocarcinoma.
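The star homoplasy constraint can be stated as a simple per-edge check: a character may change only when the parent still holds the unedited state. The sketch below verifies that condition on a fully labeled tree. It is not the Startle algorithm, which searches for a maximum parsimony tree; the data layout (a child-to-parent dictionary and state tuples), the use of 0 for the unedited state, and the absence of missing data are all assumptions made for illustration.

```python
def satisfies_star_homoplasy(parent, states, unedited=0):
    """Check that every character mutates at most once along each lineage.

    parent: dict mapping each non-root node to its parent.
    states: dict mapping every node to a tuple of character states.
    A change on an edge is allowed only if the parent state is `unedited`,
    so an edited character can never be edited again further down.
    """
    for child, par in parent.items():
        for s_par, s_child in zip(states[par], states[child]):
            if s_child != s_par and s_par != unedited:
                return False
    return True


tree = {"a": "r", "b": "r", "c": "a"}
labels = {"r": (0, 0), "a": (1, 0), "b": (1, 0), "c": (1, 2)}
ok = satisfies_star_homoplasy(tree, labels)

# Re-editing character 0 from state 1 to state 2 below node "a" violates the model.
bad_labels = dict(labels, c=(2, 2))
violates = not satisfies_star_homoplasy(tree, bad_labels)
```

In the valid labeling, nodes `a` and `b` acquire the same edit independently. That is homoplasy, and the model permits it as long as no lineage edits the same character twice.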


Subject(s)
CRISPR-Cas Systems, Gene Editing, Animals, Mice, CRISPR-Cas Systems/genetics, Phylogeny, Gene Editing/methods, Cell Lineage/genetics, Mutation
6.
PLoS Comput Biol ; 19(11): e1011590, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37943952

ABSTRACT

MOTIVATION: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. Copy number aberrations alter multiple adjacent genomic loci, violating the standard phylogenetic assumption that loci evolve independently. Thus, specialized models to infer copy number phylogenies have been introduced. A widely used model is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. RESULTS: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. 
Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.
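The difference between the CNT and ZCNT event models can be shown on a small integer vector. The sketch below applies a single amplification or deletion to a contiguous segment. In the default mode, positions with zero copies are left untouched, which reflects the usual reading of the CNT model that lost material cannot be regained (an assumption here, since the abstract does not spell it out). In zero-agnostic mode the event applies to every position in the segment. The function name and flag are hypothetical, and negative intermediate values are not clamped, because the abstract does not say how the published model treats them.

```python
def apply_event(profile, start, end, delta, zero_agnostic=False):
    """Apply one copy number event to positions start..end-1 of an integer profile.

    delta is +1 for an amplification and -1 for a deletion.
    zero_agnostic=False: positions with zero copies are skipped (CNT-style, assumed).
    zero_agnostic=True: every position in the segment changes (ZCNT-style, assumed).
    """
    if delta not in (1, -1):
        raise ValueError("delta must be +1 or -1")
    result = list(profile)
    for i in range(start, end):
        if result[i] == 0 and not zero_agnostic:
            continue  # a segment with zero copies stays at zero
        result[i] += delta
    return result


cnt_result = apply_event([2, 0, 2], 0, 3, +1)
zcnt_result = apply_event([2, 0, 2], 0, 3, +1, zero_agnostic=True)
```

The two results differ only at the position that started with zero copies, which is the simplification the zero-agnostic model makes.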


Subject(s)
DNA Copy Number Variations, Neoplasms, Humans, Phylogeny, DNA Copy Number Variations/genetics, Neoplasms/genetics, Genomics/methods, Genome, Algorithms
7.
Nature ; 623(7986): 432-441, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37914932

ABSTRACT

Chromatin accessibility is essential in regulating gene expression and cellular identity, and alterations in accessibility have been implicated in driving cancer initiation, progression and metastasis [1-4]. Although the genetic contributions to oncogenic transitions have been investigated, epigenetic drivers remain less understood. Here we constructed a pan-cancer epigenetic and transcriptomic atlas using single-nucleus chromatin accessibility data (using single-nucleus assay for transposase-accessible chromatin) from 225 samples and matched single-cell or single-nucleus RNA-sequencing expression data from 206 samples. With over 1 million cells from each platform analysed through the enrichment of accessible chromatin regions, transcription factor motifs and regulons, we identified epigenetic drivers associated with cancer transitions. Some epigenetic drivers appeared in multiple cancers (for example, regulatory regions of ABCC1 and VEGFA; GATA6 and FOX-family motifs), whereas others were cancer specific (for example, regulatory regions of FGF19, ASAP2 and EN1, and the PBX3 motif). Among epigenetically altered pathways, TP53, hypoxia and TNF signalling were linked to cancer initiation, whereas oestrogen response, epithelial-mesenchymal transition and apical junction were tied to metastatic transition. Furthermore, we revealed a marked correlation between enhancer accessibility and gene expression and uncovered cooperation between epigenetic and genetic drivers. This atlas provides a foundation for further investigation of epigenetic dynamics in cancer transitions.


Subject(s)
Genetic Epigenesis, Neoplastic Gene Expression Regulation, Neoplasms, Humans, Cell Hypoxia, Cell Nucleus, Chromatin/genetics, Chromatin/metabolism, Genetic Enhancer Elements/genetics, Genetic Epigenesis/genetics, Epithelial-Mesenchymal Transition, Estrogens/metabolism, Gene Expression Profiling, GTPase-Activating Proteins/metabolism, Neoplasm Metastasis, Neoplasms/classification, Neoplasms/genetics, Neoplasms/pathology, Nucleic Acid Regulatory Sequences/genetics, Single-Cell Analysis, Transcription Factors/metabolism
8.
bioRxiv ; 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37873258

ABSTRACT

Spatially resolved transcriptomics technologies provide high-throughput measurements of gene expression in a tissue slice, but the sparsity of this data complicates the analysis of spatial gene expression patterns such as gene expression gradients. We address these issues by deriving a topographic map of a tissue slice, analogous to a map of elevation in a landscape, using a novel quantity called the isodepth. Contours of constant isodepth enclose spatial domains with distinct cell type composition, while gradients of the isodepth indicate spatial directions of maximum change in gene expression. We develop GASTON, an unsupervised and interpretable deep learning algorithm that simultaneously learns the isodepth, spatial gene expression gradients, and piecewise linear functions of the isodepth that model both continuous gradients and discontinuous spatial variation in the expression of individual genes. We validate GASTON by showing that it accurately identifies spatial domains and marker genes across several biological systems. In SRT data from the brain, GASTON reveals gradients of neuronal differentiation and firing, and in SRT data from a tumor sample, GASTON infers gradients of metabolic activity and epithelial-mesenchymal transition (EMT)-related gene expression in the tumor microenvironment.

9.
NAR Cancer ; 5(3): zcad045, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37636316

ABSTRACT

Androgen receptor (AR) inhibition is standard of care for advanced prostate cancer (PC). However, efficacy is limited by progression to castration-resistant PC (CRPC), usually due to AR re-activation via mechanisms that include AR amplification and structural rearrangement. These two classes of AR alterations often co-occur in CRPC tumors, but it is unclear whether this reflects intercellular or intracellular heterogeneity of AR. Resolving this is important for developing new therapies and predictive biomarkers. Here, we analyzed 41 CRPC tumors and 6 patient-derived xenografts (PDXs) using linked-read DNA-sequencing, and identified 7 tumors that developed complex, multiply-rearranged AR gene structures in conjunction with very high AR copy number. Analysis of PDX models by optical genome mapping and fluorescence in situ hybridization showed that AR residing on extrachromosomal DNA (ecDNA) was an underlying mechanism, and was associated with elevated levels and diversity of AR expression. This study identifies co-evolution of AR gene copy number and structural complexity via ecDNA as a mechanism associated with endocrine therapy resistance.

10.
Genome Res ; 33(7): 1124-1132, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37553263

ABSTRACT

Spatially resolved transcriptomics (SRT) technologies measure messenger RNA (mRNA) expression at thousands of locations in a tissue slice. However, nearly all SRT technologies measure expression in two-dimensional (2D) slices extracted from a 3D tissue, thus losing information that is shared across multiple slices from the same tissue. Integrating SRT data across multiple slices can help recover this information and improve downstream expression analyses, but multislice alignment and integration remains a challenging task. Existing methods for integrating SRT data either do not use spatial information or assume that the morphology of the tissue is largely preserved across slices, an assumption that is often violated because of biological or technical reasons. We introduce PASTE2, a method for partial alignment and 3D reconstruction of multislice SRT data sets, allowing only partial overlap between aligned slices and/or slice-specific cell types. PASTE2 formulates a novel partial fused Gromov-Wasserstein optimal transport problem, which we solve using a conditional gradient algorithm. PASTE2 includes a model selection procedure to estimate the fraction of overlap between slices, and optionally uses information from histological images that accompany some SRT experiments. We show on both simulated and real data that PASTE2 obtains more accurate alignments than existing methods. We further use PASTE2 to reconstruct a 3D map of gene expression in a Drosophila embryo from a 16 slice Stereo-seq data set. PASTE2 produces accurate alignments of multislice data sets from multiple SRT technologies, enabling detailed studies of spatial gene expression across a wide range of biological applications.
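For orientation, a partial fused Gromov-Wasserstein problem in its general form from the optimal transport literature can be written as below. This is a sketch under stated assumptions rather than the exact PASTE2 objective: the symbols are chosen here for illustration, with $c(x_i, y_j)$ an expression dissimilarity between spot $i$ of one slice and spot $j$ of the other, $D$ and $D'$ the within-slice spatial distance matrices, $g$ and $g'$ the spot weights, $\alpha \in [0,1]$ a balance parameter, and $s \in (0,1]$ the fraction of mass to be aligned.

```latex
\begin{aligned}
\min_{\pi \ge 0} \quad
  & (1-\alpha) \sum_{i,j} c(x_i, y_j)\, \pi_{ij}
    \;+\; \alpha \sum_{i,j,k,l} \left( D_{ik} - D'_{jl} \right)^{2} \pi_{ij}\, \pi_{kl} \\
\text{subject to} \quad
  & \pi \mathbf{1} \le g, \qquad
    \pi^{\top} \mathbf{1} \le g', \qquad
    \mathbf{1}^{\top} \pi\, \mathbf{1} = s .
\end{aligned}
```

The inequality constraints are what make the alignment partial: spots in either slice may be left unmatched, and only a total mass $s$ is transported. Setting $s = 1$ with equality constraints recovers the balanced fused problem.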


Subject(s)
Algorithms, Transcriptome
11.
bioRxiv ; 2023 Jul 15.
Article in English | MEDLINE | ID: mdl-37502835

ABSTRACT

Multi-region DNA sequencing of primary tumors and metastases from individual patients helps identify somatic aberrations driving cancer development. However, most methods to infer copy-number aberrations (CNAs) analyze individual samples. We introduce HATCHet2 to identify haplotype- and clone-specific CNAs simultaneously from multiple bulk samples. HATCHet2 introduces a novel statistic, the mirrored haplotype B-allele frequency (mhBAF), to identify mirrored-subclonal CNAs having different numbers of copies of parental haplotypes in different tumor clones. HATCHet2 also has high accuracy in identifying focal CNAs and extends the earlier HATCHet method in several directions. We demonstrate HATCHet2's improved accuracy using simulations and a single-cell sequencing dataset. HATCHet2 analysis of 50 prostate cancer samples from 10 patients reveals previously-unreported mirrored-subclonal CNAs affecting cancer genes.

12.
bioRxiv ; 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37090633

ABSTRACT

Motivation: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From this data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. A widely used model to infer such copy number phylogenies is the copy number transformation (CNT) model in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile to another. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. Results: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. Extending our polynomial time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.

13.
Cancer Res Commun ; 3(4): 564-575, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37066022

ABSTRACT

Osteosarcoma is an aggressive malignancy characterized by high genomic complexity. Identification of few recurrent mutations in protein coding genes suggests that somatic copy-number aberrations (SCNA) are the genetic drivers of disease. Models around genomic instability conflict: it is unclear whether osteosarcomas result from pervasive ongoing clonal evolution with continuous optimization of the fitness landscape or an early catastrophic event followed by stable maintenance of an abnormal genome. We address this question by investigating SCNAs in >12,000 tumor cells obtained from human osteosarcomas using single-cell DNA sequencing, with a degree of precision and accuracy not possible when inferring single-cell states using bulk sequencing. Using the CHISEL algorithm, we inferred allele- and haplotype-specific SCNAs from this whole-genome single-cell DNA sequencing data. Surprisingly, despite extensive structural complexity, these tumors exhibit a high degree of cell-cell homogeneity with little subclonal diversification. Longitudinal analysis of patient samples obtained at distant therapeutic timepoints (diagnosis, relapse) demonstrated remarkable conservation of SCNA profiles over tumor evolution. Phylogenetic analysis suggests that the majority of SCNAs were acquired early in the oncogenic process, with relatively few structure-altering events arising in response to therapy or during adaptation to growth in metastatic tissues. These data further support the emerging hypothesis that early catastrophic events, rather than sustained genomic instability, give rise to structural complexity, which is then preserved over long periods of tumor developmental time. Significance: Chromosomally complex tumors are often described as genomically unstable.
However, determining whether complexity arises from remote time-limited events that give rise to structural alterations or a progressive accumulation of structural events in persistently unstable tumors has implications for diagnosis, biomarker assessment, mechanisms of treatment resistance, and represents a conceptual advance in our understanding of intratumoral heterogeneity and tumor evolution.


Subject(s)
Bone Neoplasms, Osteosarcoma, Humans, Phylogeny, DNA Copy Number Variations/genetics, Local Neoplasm Recurrence, Osteosarcoma/genetics, Genomic Instability/genetics, Bone Neoplasms/genetics
14.
Nature ; 616(7955): 113-122, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36922587

ABSTRACT

Emerging spatial technologies, including spatial transcriptomics and spatial epigenomics, are becoming powerful tools for profiling of cellular states in the tissue context [1-5]. However, current methods capture only one layer of omics information at a time, precluding the possibility of examining the mechanistic relationship across the central dogma of molecular biology. Here, we present two technologies for spatially resolved, genome-wide, joint profiling of the epigenome and transcriptome by cosequencing chromatin accessibility and gene expression, or histone modifications (H3K27me3, H3K27ac or H3K4me3) and gene expression on the same tissue section at near-single-cell resolution. These were applied to embryonic and juvenile mouse brain, as well as adult human brain, to map how epigenetic mechanisms control transcriptional phenotype and cell dynamics in tissue. Although highly concordant tissue features were identified by either spatial epigenome or spatial transcriptome, we also observed distinct patterns, suggesting their differential roles in defining cell states. Linking epigenome to transcriptome pixel by pixel allows the uncovering of new insights in spatial epigenetic priming, differentiation and gene regulation within the tissue architecture. These technologies are of great interest in life science and biomedical research.


Subject(s)
Chromatin, Epigenome, Mammals, Transcriptome, Animals, Humans, Mice, Chromatin/genetics, Chromatin/metabolism, Genetic Epigenesis, Epigenomics, Gene Expression Profiling, Gene Expression Regulation, Mammals/genetics, Histones/chemistry, Histones/metabolism, Single-Cell Analysis, Organ Specificity, Brain/embryology, Brain/metabolism, Aging/genetics
15.
bioRxiv ; 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36711528

ABSTRACT

Tumors consist of subpopulations of cells that harbor distinct collections of somatic mutations. These mutations range in scale from single nucleotide variants (SNVs) to large-scale copy-number aberrations (CNAs). While many approaches infer tumor phylogenies using SNVs as phylogenetic markers, CNAs that overlap SNVs may lead to erroneous phylogenetic inference. Specifically, an SNV may be lost in a cell due to a deletion of the genomic segment containing the SNV. Unfortunately, no current single-cell DNA sequencing (scDNA-seq) technology produces accurate measurements of both SNVs and CNAs. For instance, recent targeted scDNA-seq technologies, such as Mission Bio Tapestri, measure SNVs with high fidelity in individual cells, but yield much less reliable measurements of CNAs. We introduce a new evolutionary model, the constrained k-Dollo model, that uses SNVs as phylogenetic markers and partial information about CNAs in the form of clustering of cells with similar copy-number profiles. This copy-number clustering constrains where loss of SNVs can occur in the phylogeny. We develop ConDoR (Constrained Dollo Reconstruction), an algorithm to infer tumor phylogenies from targeted scDNA-seq data using the constrained k-Dollo model. We show that ConDoR outperforms existing methods on simulated data. We use ConDoR to analyze a new multi-region targeted scDNA-seq dataset of 2153 cells from a pancreatic ductal adenocarcinoma (PDAC) tumor and produce a more plausible phylogeny compared to existing methods that conforms to histological results for the tumor from a previous study. We also analyze a metastatic colorectal cancer dataset, deriving a more parsimonious phylogeny than previously published analyses and with a simpler monoclonal origin of metastasis compared to the original study. Code availability: Software is available at https://github.com/raphael-group/constrained-Dollo.

16.
bioRxiv ; 2023 Jan 08.
Article in English | MEDLINE | ID: mdl-36711750

ABSTRACT

Spatially resolved transcriptomics (SRT) technologies measure mRNA expression at thousands of locations in a tissue slice. However, nearly all SRT technologies measure expression in two dimensional slices extracted from a three-dimensional tissue, thus losing information that is shared across multiple slices from the same tissue. Integrating SRT data across multiple slices can help recover this information and improve downstream expression analyses, but multi-slice alignment and integration remains a challenging task. Existing methods for integrating SRT data either do not use spatial information or assume that the morphology of the tissue is largely preserved across slices, an assumption that is often violated due to biological or technical reasons. We introduce PASTE2, a method for partial alignment and 3D reconstruction of multi-slice SRT datasets, allowing only partial overlap between aligned slices and/or slice-specific cell types. PASTE2 formulates a novel partial Fused Gromov-Wasserstein Optimal Transport problem, which we solve using a conditional gradient algorithm. PASTE2 includes a model selection procedure to estimate the fraction of overlap between slices, and optionally uses information from histological images that accompany some SRT experiments. We show on both simulated and real data that PASTE2 obtains more accurate alignments than existing methods. We further use PASTE2 to reconstruct a 3D map of gene expression in a Drosophila embryo from a 16 slice Stereo-seq dataset. PASTE2 produces accurate alignments of multi-slice datasets from multiple SRT technologies, enabling detailed studies of spatial gene expression across a wide range of biological applications. Code availability: Software is available at https://github.com/raphael-group/paste2.

18.
J Comput Biol ; 29(12): 1305-1323, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36525308

ABSTRACT

A standard paradigm in computational biology is to leverage interaction networks as prior knowledge in analyzing high-throughput biological data, where the data give a score for each vertex in the network. One classical approach is the identification of altered subnetworks, or subnetworks of the interaction network that have both outlier vertex scores and a defined network topology. One class of algorithms for identifying altered subnetworks searches for high-scoring subnetworks in subnetwork families with simple topological constraints, such as connected subnetworks, and has sound statistical guarantees. A second class of algorithms employs network propagation (the smoothing of vertex scores over the network using a random walk or diffusion process) and utilizes the global structure of the network. However, network propagation algorithms often rely on ad hoc heuristics that lack a rigorous statistical foundation. In this work, we unify the subnetwork family and network propagation approaches by deriving the propagation family, a subnetwork family that approximates the sets of vertices ranked highly by network propagation approaches. We introduce NetMix2, a principled algorithm for identifying altered subnetworks from a wide range of subnetwork families. When using the propagation family, NetMix2 combines the advantages of the subnetwork family and network propagation approaches. NetMix2 outperforms other methods, including network propagation, on simulated data, pan-cancer somatic mutation data, and genome-wide association data from multiple human diseases.
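Network propagation, as described above, smooths vertex scores over the graph with a random walk. One common variant is the random walk with restart, sketched here with NumPy. This illustrates the propagation step only and is not NetMix2 or its propagation family; the restart probability, the column normalization, and the function name are assumptions chosen for the example.

```python
import numpy as np


def propagate_scores(adjacency, scores, restart=0.4):
    """Smooth vertex scores with a random walk with restart.

    Returns restart * (I - (1 - restart) * W)^-1 @ scores, where W is the
    column-normalized adjacency matrix of an undirected graph.
    """
    A = np.asarray(adjacency, dtype=float)
    s = np.asarray(scores, dtype=float)
    degree = A.sum(axis=0)
    if np.any(degree == 0):
        raise ValueError("every vertex needs at least one edge")
    W = A / degree  # divides column j by the degree of vertex j
    n = A.shape[0]
    return restart * np.linalg.solve(np.eye(n) - (1 - restart) * W, s)


# Path graph 0 - 1 - 2 with all of the initial score on vertex 0.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
smoothed = propagate_scores(path, [1.0, 0.0, 0.0])
```

Because W is column-stochastic, the total score is conserved, and the smoothed values decay with distance from the scored vertex. Ranking vertices by such smoothed scores is the kind of heuristic that the propagation family is meant to approximate with a subnetwork family.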


Subject(s)
Genome-Wide Association Study, Neoplasms, Humans, Computational Biology/methods, Algorithms
19.
Cell Syst ; 13(10): 786-797.e13, 2022 Oct 19.
Article in English | MEDLINE | ID: mdl-36265465

ABSTRACT

Spatially resolved transcriptomics (SRT) technologies measure gene expression at known locations in a tissue slice, enabling the identification of spatially varying genes or cell types. Current approaches for these tasks assume either that gene expression varies continuously across a tissue or that a tissue contains a small number of regions with distinct cellular composition. We propose a model for SRT data from layered tissues that includes both continuous and discrete spatial variation in expression and an algorithm, Belayer, to learn the parameters of this model. Belayer models gene expression as a piecewise linear function of the relative depth of a tissue layer with possible discontinuities at layer boundaries. We use conformal maps to model relative depth and derive a dynamic programming algorithm to infer layer boundaries and gene expression functions. Belayer accurately identifies tissue layers and biologically meaningful spatially varying genes in SRT data from the brain and skin.
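The combination of piecewise linear expression functions and dynamic programming can be sketched in one dimension. The code below takes a depth value per spot (assumed to be already computed and sorted, which sidesteps the conformal-map step entirely) and one gene's expression, then finds the layer boundaries that minimize total squared error when each layer gets its own least-squares line. It is a generic segmented-regression dynamic program, not Belayer; all names and the squared-error loss are assumptions.

```python
import numpy as np


def segment_cost(x, y):
    """Sum of squared residuals of the least-squares line through (x, y)."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((design @ coef - y) ** 2))


def fit_layers(depth, expr, n_layers, min_size=2):
    """Return (layer start indices, total error) for a piecewise linear fit.

    depth must be sorted in ascending order; discontinuities are allowed
    at layer boundaries because each layer is fitted independently.
    """
    x = np.asarray(depth, dtype=float)
    y = np.asarray(expr, dtype=float)
    n = len(x)
    # cost[i, j]: error of a single line fitted to points i..j-1
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_size, n + 1):
            cost[i, j] = segment_cost(x[i:j], y[i:j])
    # best[k, j]: minimum error covering the first j points with k layers
    best = np.full((n_layers + 1, n + 1), np.inf)
    back = np.zeros((n_layers + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_layers + 1):
        for j in range(1, n + 1):
            candidates = best[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(candidates))
            best[k, j] = candidates[i]
            back[k, j] = i
    # Trace the optimal boundaries back from the last point.
    starts, j = [], n
    for k in range(n_layers, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return starts[::-1], float(best[n_layers, n])


depth = list(range(10))
expr = [0, 1, 2, 3, 4, 20, 19, 18, 17, 16]  # slope changes and jumps at index 5
starts, err = fit_layers(depth, expr, n_layers=2)
```

The dynamic program runs in time quadratic in the number of spots per layer count, which is acceptable for a sketch. Sharing boundaries across many genes would require summing the segment costs over genes before running the same recursion.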


Subject(s)
Algorithms, Transcriptome, Transcriptome/genetics
20.
Cancer Cell ; 40(12): 1448-1453, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36270276

ABSTRACT

3D patient tumor avatars (3D-PTAs) hold promise for next-generation precision medicine. Here, we describe the benefits and challenges of 3D-PTA technologies and necessary future steps to realize their potential for clinical decision making. 3D-PTAs require standardization criteria and prospective trials to establish clinical benefits. Innovative trial designs that combine omics and 3D-PTA readouts may lead to more accurate clinical predictors, and an integrated platform that combines diagnostic and therapeutic development will accelerate new treatments for patients with refractory disease.


Subject(s)
Neoplasms, Humans, Neoplasms/genetics, Neoplasms/therapy, Neoplasms/diagnosis, Precision Medicine, Prospective Studies, Medical Oncology