Search | VHL Regional Portal

Evolutionary signatures of human cancers revealed via genomic analysis of over 35,000 patients.

Fontana, Diletta; Crespiatico, Ilaria; Crippa, Valentina; Malighetti, Federica; Villa, Matteo; Angaroni, Fabrizio; De Sano, Luca; Aroldi, Andrea; Antoniotti, Marco; Caravagna, Giulio; Piazza, Rocco; Graudenzi, Alex; Mologni, Luca; Ramazzotti, Daniele.

Nat Commun ; 14(1): 5982, 2023 09 25.

Article in English | MEDLINE | ID: mdl-37749078

ABSTRACT

Recurring sequences of genomic alterations occurring across patients can highlight repeated evolutionary processes with significant implications for predicting cancer progression. Leveraging the ever-increasing availability of cancer omics data, here we unveil cancer's evolutionary signatures tied to distinct disease outcomes, representing "favored trajectories" of acquisition of driver mutations detected in patients with similar prognosis. We present a framework named ASCETIC (Agony-baSed Cancer EvoluTion InferenCe) to extract such signatures from sequencing experiments generated by different technologies such as bulk and single-cell sequencing data. We apply ASCETIC to (i) single-cell data from 146 myeloid malignancy patients and bulk sequencing from 366 acute myeloid leukemia patients, (ii) multi-region sequencing from 100 early-stage lung cancer patients, (iii) exome/genome data from 10,000+ Pan-Cancer Atlas samples, and (iv) targeted sequencing from 25,000+ MSK-MET metastatic patients, revealing subtype-specific single-nucleotide variant signatures associated with distinct prognostic clusters. Validations on several datasets underscore the robustness and generalizability of the extracted signatures.

Subject(s)

Genomics , Neoplasms , Humans , Neoplasms/genetics , Exome/genetics , Patients , Technology
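
As a rough illustration of what "recurring sequences of genomic alterations" across patients can mean in practice, the sketch below counts how often one alteration is acquired before another in a cohort. It is not the ASCETIC algorithm, which the abstract describes only as agony-based. The toy cohort, the placeholder gene labels, and the helper name `count_precedences` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_precedences(orderings):
    """Count, over all patients, how often alteration a precedes alteration b.

    `orderings` is a list of per-patient lists, each giving alterations in
    their (assumed known) order of acquisition.
    """
    counts = Counter()
    for order in orderings:
        # combinations() preserves list order, so (a, b) means a came first.
        for a, b in combinations(order, 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy cohort with placeholder labels, for illustration only.
patients = [
    ["geneA", "geneB", "geneC"],
    ["geneA", "geneC"],
    ["geneB", "geneA", "geneC"],
]

precedences = count_precedences(patients)
for (a, b), n in sorted(precedences.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{a} before {b}: {n} of {len(patients)} patients")
```

Pairs that are ordered the same way in most patients are candidates for a shared trajectory; a real method would also have to handle uncertainty in the per-patient orderings, which this sketch assumes away.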

Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data.

Ramazzotti, Daniele; Graudenzi, Alex; De Sano, Luca; Antoniotti, Marco; Caravagna, Giulio.

BMC Bioinformatics ; 20(1): 210, 2019 Apr 25.

Article in English | MEDLINE | ID: mdl-31023236

ABSTRACT

BACKGROUND: A large number of algorithms are being developed to reconstruct evolutionary models of individual tumours from genome sequencing data. Most methods can analyze multiple samples collected either through bulk multi-region sequencing experiments or the sequencing of individual cancer cells. However, the same method can rarely support both data types. RESULTS: We introduce TRaIT, a computational framework to infer mutational graphs that model the accumulation of multiple types of somatic alterations driving tumour evolution. Compared to other tools, TRaIT supports multi-region and single-cell sequencing data within the same statistical framework, and delivers expressive models that capture many complex evolutionary phenomena. TRaIT improves accuracy, robustness to data-specific errors and computational complexity compared to competing methods. CONCLUSIONS: We show that the application of TRaIT to single-cell and multi-region cancer datasets can produce accurate and reliable models of single-tumour evolution, quantify the extent of intra-tumour heterogeneity and generate new testable experimental hypotheses.

Subject(s)

Algorithms , Neoplasms/pathology , Computational Biology/methods , Evolution, Molecular , Humans , Mutation , Neoplasms/classification , Neoplasms/genetics , Sequence Analysis, DNA , Single-Cell Analysis

SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning.

Wang, Bo; Ramazzotti, Daniele; De Sano, Luca; Zhu, Junjie; Pierson, Emma; Batzoglou, Serafim.

Proteomics ; 18(2), 2018 01.

Article in English | MEDLINE | ID: mdl-29265724

ABSTRACT

SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples, is presented here. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via a better visualization. SIMLR is available on GitHub (https://github.com/BatzoglouLabSU/SIMLR) in both R and MATLAB implementations. Furthermore, it is also available as an R package on http://bioconductor.org.

Subject(s)

Genomics/methods , Machine Learning , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Software , Algorithms , Humans
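
To make the idea of a multi-kernel sample-to-sample similarity more concrete, here is a minimal sketch that averages several Gaussian kernels with different bandwidths. It is a simplification under stated assumptions, not the SIMLR method: this sketch fixes uniform kernel weights and performs no optimization, and the function names, bandwidths, and toy expression vectors are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) similarity between two expression vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def combined_similarity(samples, sigmas, weights=None):
    """Weighted average of several Gaussian kernels, one per bandwidth.

    Weights default to uniform; this is an assumption of the sketch.
    """
    if weights is None:
        weights = [1.0 / len(sigmas)] * len(sigmas)
    n = len(samples)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sim[i][j] = sum(
                w * gaussian_kernel(samples[i], samples[j], s)
                for w, s in zip(weights, sigmas)
            )
    return sim


# Hypothetical toy data: two similar samples and one distant sample.
samples = [[1.0, 0.0], [1.1, 0.1], [5.0, 4.0]]
similarity = combined_similarity(samples, sigmas=[0.5, 1.0, 2.0])

for row in similarity:
    print(["%.3f" % value for value in row])
```

Using several bandwidths avoids committing to a single notion of "close", which is the motivation for combining kernels; the resulting matrix could then feed a clustering or dimension-reduction step.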

Erratum: OncoScore: a novel, Internet-based tool to assess the oncogenic potential of genes.

Piazza, Rocco; Ramazzotti, Daniele; Spinelli, Roberta; Pirola, Alessandra; De Sano, Luca; Ferrari, Pierangelo; Magistroni, Vera; Cordani, Nicoletta; Sharma, Nitesh; Gambacorti-Passerini, Carlo.

Sci Rep ; 7: 46823, 2017 05 22.

Article in English | MEDLINE | ID: mdl-28530230

ABSTRACT

This corrects the article DOI: 10.1038/srep46290.

OncoScore: a novel, Internet-based tool to assess the oncogenic potential of genes.

Piazza, Rocco; Ramazzotti, Daniele; Spinelli, Roberta; Pirola, Alessandra; De Sano, Luca; Ferrari, Pierangelo; Magistroni, Vera; Cordani, Nicoletta; Sharma, Nitesh; Gambacorti-Passerini, Carlo.

Sci Rep ; 7: 46290, 2017 04 07.

Article in English | MEDLINE | ID: mdl-28387367

ABSTRACT

The complicated, evolving landscape of cancer mutations poses a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments. The ability to prioritize these variants is therefore of paramount importance. To address this issue we developed OncoScore, a text-mining tool that ranks genes according to their association with cancer, based on available biomedical literature. Receiver operating characteristic curve and the area under the curve (AUC) metrics on manually curated datasets confirmed the excellent discriminating capability of OncoScore (OncoScore cut-off threshold = 21.09; AUC = 90.3%, 95% CI: 88.1-92.5%), indicating that OncoScore provides useful results in cases where an efficient prioritization of cancer-associated genes is needed.

Subject(s)

Genes, Neoplasm , Neoplasms/genetics , Software , Humans , Mutation
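
The evaluation reported in the OncoScore abstract (an AUC and a cut-off threshold) can be illustrated with a small sketch. The AUC below is computed as the probability that a randomly chosen positive gene scores higher than a randomly chosen negative one. The scores and labels are hypothetical, and treating scores at or above the cut-off as cancer-associated is an assumption about the direction of the threshold; only the value 21.09 comes from the abstract.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


CUTOFF = 21.09  # cut-off threshold reported in the abstract

# Hypothetical scores and curated labels (1 = cancer gene), illustration only.
scores = [85.2, 40.1, 22.0, 18.3, 30.5, 5.0]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
# Assumption: a score at or above the cut-off flags a gene as cancer-associated.
flagged = [s >= CUTOFF for s in scores]

print(f"AUC = {auc:.3f}")
print("flagged:", flagged)
```

The pairwise formulation is quadratic in the number of genes, which is acceptable for a small curated set; a rank-based computation would be preferable for large lists.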

Algorithmic methods to infer the evolutionary trajectories in cancer progression.

Caravagna, Giulio; Graudenzi, Alex; Ramazzotti, Daniele; Sanz-Pamplona, Rebeca; De Sano, Luca; Mauri, Giancarlo; Moreno, Victor; Antoniotti, Marco; Mishra, Bud.

Proc Natl Acad Sci U S A ; 113(28): E4025-34, 2016 07 12.

Article in English | MEDLINE | ID: mdl-27357673

ABSTRACT

The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the "selective advantage" relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc's ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses.

Subject(s)

Biological Evolution , Colorectal Neoplasms/genetics , Models, Genetic , Algorithms , Humans , Machine Learning , Microsatellite Repeats

TRONCO: an R package for the inference of cancer progression models from heterogeneous genomic data.

De Sano, Luca; Caravagna, Giulio; Ramazzotti, Daniele; Graudenzi, Alex; Mauri, Giancarlo; Mishra, Bud; Antoniotti, Marco.

Bioinformatics ; 32(12): 1911-3, 2016 06 15.

Article in English | MEDLINE | ID: mdl-26861821

ABSTRACT

MOTIVATION: We introduce TRanslational ONCOlogy (TRONCO), an open-source R package that implements the state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-sectional samples, e.g. retrieved from publicly available databases, and individual-level models that reveal the clonal evolutionary history in single cancer patients, when multiple samples, e.g. multiple biopsies or single-cell sequencing data, are available. The resulting models can provide key hints for uncovering the evolutionary trajectories of cancer, especially for precision medicine or personalized therapy. AVAILABILITY AND IMPLEMENTATION: TRONCO is released under the GPL license, is hosted at http://bimib.disco.unimib.it/ (Software section) and is also archived at bioconductor.org. CONTACT: tronco@disco.unimib.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Models, Theoretical , Neoplasms/genetics , Software , Algorithms , Disease Progression , Epigenesis, Genetic , Genomics , Humans , User-Computer Interface

Automatising the analysis of stochastic biochemical time-series.

Caravagna, Giulio; De Sano, Luca; Antoniotti, Marco.

BMC Bioinformatics ; 16 Suppl 9: S8, 2015.

Article in English | MEDLINE | ID: mdl-26051821

ABSTRACT

BACKGROUND: Mathematical and computational modelling of biochemical systems has seen a lot of effort devoted to the definition and implementation of high-performance mechanistic simulation frameworks. Within these frameworks it is possible to analyse complex models under a variety of configurations, eventually selecting the best setting of, e.g., parameters for a target system. MOTIVATION: This operational pipeline relies on the ability to interpret the predictions of a model, often represented as simulation time-series. Thus, an efficient data analysis pipeline is crucial to automatise time-series analyses, bearing in mind that errors in this phase might mislead the modeller's conclusions. RESULTS: For this reason we have developed an intuitive framework-independent Python tool to automate analyses common to a variety of modelling approaches. These include assessment of useful non-trivial statistics for simulation ensembles, e.g., estimation of master equations. Intuitive and domain-independent batch scripts will allow the researcher to automatically prepare reports, thus speeding up the usual model-definition, testing and refinement pipeline.

Subject(s)

Computational Biology , Computer Simulation , Models, Biological , Predatory Behavior , Software , Animals , Automation , Bacteria/genetics , Models, Statistical , Stochastic Processes
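
The kind of ensemble analysis mentioned in the abstract above can be sketched in a few lines of Python. The example computes per-time-point mean and variance over a set of stochastic simulation runs, plus the empirical distribution of states at one time point, which is one simple way to approximate the solution of a master equation from samples. This is not the authors' tool: the function names and the toy trajectories are hypothetical, and all runs are assumed to share the same time grid.

```python
from collections import Counter
from statistics import mean, pvariance


def ensemble_statistics(trajectories):
    """Per-time-point mean and population variance across an ensemble.

    Each trajectory is a list of values sampled on the same time grid.
    """
    if len({len(t) for t in trajectories}) != 1:
        raise ValueError("all trajectories must share the same time grid")
    columns = list(zip(*trajectories))  # one tuple of values per time point
    return [mean(c) for c in columns], [pvariance(c) for c in columns]


def empirical_distribution(trajectories, time_index):
    """Empirical probability of each observed state at one time point."""
    values = [t[time_index] for t in trajectories]
    counts = Counter(values)
    return {state: n / len(values) for state, n in counts.items()}


# Hypothetical molecule counts from four simulation runs, illustration only.
runs = [
    [10, 12, 15, 14],
    [10, 11, 13, 16],
    [10, 13, 15, 15],
    [10, 12, 13, 15],
]

means, variances = ensemble_statistics(runs)
dist_t2 = empirical_distribution(runs, 2)

print("means:", means)
print("variances:", variances)
print("state distribution at index 2:", dist_t2)
```

With only a handful of runs the empirical distribution is coarse; in practice the ensemble size would need to be large enough for the estimate to stabilise.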
