Results 1 - 15 of 15
1.
Nat Commun ; 14(1): 5982, 2023 09 25.
Article in English | MEDLINE | ID: mdl-37749078

ABSTRACT

Recurring sequences of genomic alterations occurring across patients can highlight repeated evolutionary processes with significant implications for predicting cancer progression. Leveraging the ever-increasing availability of cancer omics data, here we unveil cancer's evolutionary signatures tied to distinct disease outcomes, representing "favored trajectories" of acquisition of driver mutations detected in patients with similar prognosis. We present a framework named ASCETIC (Agony-baSed Cancer EvoluTion InferenCe) to extract such signatures from sequencing experiments generated by different technologies such as bulk and single-cell sequencing data. We apply ASCETIC to (i) single-cell data from 146 myeloid malignancy patients and bulk sequencing from 366 acute myeloid leukemia patients, (ii) multi-region sequencing from 100 early-stage lung cancer patients, (iii) exome/genome data from 10,000+ Pan-Cancer Atlas samples, and (iv) targeted sequencing from 25,000+ MSK-MET metastatic patients, revealing subtype-specific single-nucleotide variant signatures associated with distinct prognostic clusters. Validations on several datasets underscore the robustness and generalizability of the extracted signatures.


Subject(s)
Genomics , Neoplasms , Humans , Neoplasms/genetics , Exome/genetics , Patients , Technology
2.
BMC Bioinformatics ; 24(1): 99, 2023 Mar 17.
Article in English | MEDLINE | ID: mdl-36932333

ABSTRACT

BACKGROUND: Longitudinal single-cell sequencing experiments of patient-derived models are increasingly employed to investigate cancer evolution. In this context, robust computational methods are needed to properly exploit the mutational profiles of single cells generated via variant calling, in order to reconstruct the evolutionary history of a tumor and characterize the impact of therapeutic strategies, such as the administration of drugs. To this end, we have recently developed the LACE framework for the Longitudinal Analysis of Cancer Evolution. RESULTS: The LACE 2.0 release aimed at inferring longitudinal clonal trees enhances the original framework with new key functionalities: an improved data management for preprocessing of standard variant calling data, a reworked inference engine, and direct connection to public databases. CONCLUSIONS: All of this is accessible through a new and interactive Shiny R graphical interface offering the possibility to apply filters helpful in discriminating relevant or potential driver mutations, set up inferential parameters, and visualize the results. The software is available at: github.com/BIMIB-DISCo/LACE.


Subject(s)
Neoplasms , Software , Humans , Neoplasms/genetics , Clone Cells
3.
BMC Bioinformatics ; 23(1): 269, 2022 Jul 08.
Article in English | MEDLINE | ID: mdl-35804300

ABSTRACT

BACKGROUND: The combined effects of biological variability and measurement-related errors on cancer sequencing data remain largely unexplored. However, the spatio-temporal simulation of multi-cellular systems provides a powerful instrument to address this issue. In particular, efficient algorithmic frameworks are needed to overcome the harsh trade-off between scalability and expressivity, so as to allow one to simulate both realistic cancer evolution scenarios and the related sequencing experiments, which can then be used to benchmark downstream bioinformatics methods. RESULT: We introduce a Julia package for SPAtial Cancer Evolution (J-SPACE), which allows one to model and simulate a broad set of experimental scenarios, phenomenological rules and sequencing settings. Specifically, J-SPACE simulates the spatial dynamics of cells as a continuous-time multi-type birth-death stochastic process on an arbitrary graph, employing different rules of interaction and an optimised Gillespie algorithm. The evolutionary dynamics of genomic alterations (single-nucleotide variants and indels) is simulated either under the Infinite Sites Assumption or several different substitution models, including one based on mutational signatures. After mimicking the spatial sampling of tumour cells, J-SPACE returns the related phylogenetic model, and allows one to generate synthetic reads from several Next-Generation Sequencing (NGS) platforms, via the ART read simulator. The results are finally returned in standard FASTA, FASTQ, SAM, ALN and Newick file formats. CONCLUSION: J-SPACE is designed to efficiently simulate the heterogeneous behaviour of a large number of cancer cells and produces a rich set of outputs. Our framework is useful to investigate the emergent spatial dynamics of cancer subpopulations, as well as to assess the impact of incomplete sampling and of experiment-specific errors. Importantly, the output of J-SPACE is designed to allow the performance assessment of downstream bioinformatics pipelines processing NGS data. J-SPACE is freely available at: https://github.com/BIMIB-DISCo/J-Space.jl.
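
The abstract above mentions a birth-death stochastic process simulated with a Gillespie algorithm. The sketch below is only an illustration of that general technique, not J-SPACE's implementation: it assumes a single cell type in a well-mixed population (no graph, no interaction rules), and the function name `gillespie_birth_death` and its parameters are hypothetical. It is written in Python, whereas J-SPACE itself is a Julia package.

```python
import random


def gillespie_birth_death(n0, birth_rate, death_rate, t_max, seed=None):
    """Simulate a linear birth-death process with the Gillespie algorithm.

    Assumption: one cell type, well-mixed population, per-cell rates.
    Returns a list of (time, population size) pairs, starting at (0.0, n0).
    """
    rng = random.Random(seed)
    per_cell_rate = birth_rate + death_rate
    t, n = 0.0, n0
    trajectory = [(t, n)]
    if per_cell_rate <= 0:
        return trajectory  # no event can ever occur
    while n > 0:
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(n * per_cell_rate)
        if t > t_max:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < birth_rate / per_cell_rate:
            n += 1  # birth
        else:
            n -= 1  # death
        trajectory.append((t, n))
    return trajectory


if __name__ == "__main__":
    for time, size in gillespie_birth_death(10, 1.0, 0.5, 2.0, seed=42)[:5]:
        print(f"t={time:.3f}  n={size}")
```

Each iteration draws one waiting time and one event, so the cost grows with the number of events. A spatial, multi-type simulator would additionally track where each cell sits and which neighbouring sites are free.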


Subject(s)
Neoplasms , Software , Computer Simulation , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/genetics , Neoplasms/pathology , Phylogeny
4.
STAR Protoc ; 3(3): 101513, 2022 09 16.
Article in English | MEDLINE | ID: mdl-35779264

ABSTRACT

We outline the features of the R package SparseSignatures and its application to determine the signatures contributing to mutation profiles of tumor samples. We describe installation details and illustrate a step-by-step approach to (1) prepare the data for signature analysis, (2) determine the optimal parameters, and (3) employ them to determine the signatures and related exposure levels in the point mutation dataset. For complete details on the use and execution of this protocol, please refer to Lal et al. (2021).


Subject(s)
Neoplasms , Algorithms , Humans , Mutation , Neoplasms/diagnosis
5.
iScience ; 25(6): 104487, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35677393

ABSTRACT

A key task of genomic surveillance of infectious viral diseases lies in the early detection of dangerous variants. Unexpected help to this end is provided by the analysis of deep sequencing data of viral samples, which are typically discarded after creating consensus sequences. Such analysis allows one to detect intra-host low-frequency mutations, which are a footprint of mutational processes underlying the origination of new variants. Their timely identification may improve public-health decision-making compared with traditional approaches exploiting consensus sequences. We present the analysis of 220,788 high-quality deep sequencing SARS-CoV-2 samples, showing that many spike and nucleocapsid mutations of interest associated with the most circulating variants, including Beta, Delta, and Omicron, might have been intercepted several months in advance. Furthermore, we show that a refined genomic surveillance system leveraging deep sequencing data might allow one to pinpoint emerging mutation patterns, providing automated data-driven support to virologists and epidemiologists.

7.
Virus Evol ; 8(1): veac026, 2022.
Article in English | MEDLINE | ID: mdl-35371557

ABSTRACT

Many large national and transnational studies have been dedicated to the analysis of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) genome, most of which focused on missense and nonsense mutations. However, approximately 30 per cent of the SARS-CoV-2 variants are synonymous, therefore changing the target codon without affecting the corresponding protein sequence. By performing a large-scale analysis of sequencing data generated from almost 400,000 SARS-CoV-2 samples, we show that silent mutations increasing the similarity of viral codons to the human ones tend to fixate in the viral genome over time. This indicates that SARS-CoV-2 codon usage is adapting to the human host, likely improving its effectiveness in using the human aminoacyl-tRNA set through the accumulation of deceitfully neutral silent mutations. One-Sentence Summary. Synonymous SARS-CoV-2 mutations related to the activity of different mutational processes may positively impact viral evolution by increasing its adaptation to the human codon usage.
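
A synonymous (silent) substitution, as described in the abstract above, changes a codon without changing the encoded amino acid. The following sketch illustrates that definition only and is not the authors' analysis pipeline. It assumes the standard genetic code and codons written in the DNA alphabet (T rather than U), and the names `CODON_TABLE` and `is_synonymous` are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(ref_codon, alt_codon):
    """Return True if the two codons differ but encode the same amino acid."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    return ref != alt and CODON_TABLE[ref] == CODON_TABLE[alt]


if __name__ == "__main__":
    print(is_synonymous("GAT", "GAC"))  # both encode aspartate
    print(is_synonymous("GAT", "GGT"))  # aspartate versus glycine
```

Comparing codon usage between virus and host would additionally require codon frequency tables for both genomes, which this sketch does not include.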

8.
Bioinformatics ; 38(3): 754-762, 2022 01 12.
Article in English | MEDLINE | ID: mdl-34647978

ABSTRACT

MOTIVATION: Driver (epi)genomic alterations underlie the positive selection of cancer subpopulations, which promotes drug resistance and relapse. Even though substantial heterogeneity is witnessed in most cancer types, mutation accumulation patterns can be regularly found and can be exploited to reconstruct predictive models of cancer evolution. Yet, available methods cannot infer logical formulas connecting events to represent alternative evolutionary routes or convergent evolution. RESULTS: We introduce PMCE, an expressive framework that leverages mutational profiles from cross-sectional sequencing data to infer probabilistic graphical models of cancer evolution including arbitrary logical formulas, and which outperforms the state of the art in terms of accuracy and robustness to noise, on simulations. The application of PMCE to 7866 samples from the TCGA database allows us to identify a highly significant correlation between the predicted evolutionary paths and the overall survival in 7 tumor types, proving that our approach can effectively stratify cancer patients in reliable risk groups. AVAILABILITY AND IMPLEMENTATION: PMCE is freely available at https://github.com/BIMIB-DISCo/PMCE, in addition to the code to replicate all the analyses presented in the manuscript. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Neoplasms , Humans , Prognosis , Cross-Sectional Studies , Neoplasms/genetics , Genomics
9.
Viruses ; 15(1), 2022 12 20.
Article in English | MEDLINE | ID: mdl-36680048

ABSTRACT

We present a large-scale analysis of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) substitutions, considering 1,585,456 high-quality raw sequencing samples, aimed at investigating the existence and quantifying the effect of mutational processes causing mutations in SARS-CoV-2 genomes when interacting with the human host. As a result, we confirmed the presence of three well-differentiated mutational processes likely ruled by reactive oxygen species (ROS), apolipoprotein B editing complex (APOBEC), and adenosine deaminase acting on RNA (ADAR). We then evaluated the activity of these mutational processes in different continental groups, showing that some samples from Africa present a significantly higher number of substitutions, most likely due to higher APOBEC activity. We finally analyzed the activity of mutational processes across different SARS-CoV-2 variants, and we found a significantly lower number of mutations attributable to APOBEC activity in samples assigned to the Omicron variant.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Mutation , Africa
10.
Curr Genomics ; 22(2): 88-97, 2021 Feb.
Article in English | MEDLINE | ID: mdl-34220296

ABSTRACT

BACKGROUND: The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis. INTRODUCTION: The combination of data integration and machine learning approaches can provide new powerful instruments to tackle the complexity of cancer development and deliver effective diagnostic and prognostic strategies. METHODS: We explore the possibility of exploiting the topological properties of sample-specific metabolic networks as features in a supervised classification task. Such networks are obtained by projecting transcriptomic data from RNA-seq experiments on genome-wide metabolic models to define weighted networks modeling the overall metabolic activity of a given sample. RESULTS: We show the classification results on a labeled breast cancer dataset from the TCGA database, including 210 samples (cancer vs. normal). In particular, we investigate how the performance is affected by a threshold-based pruning of the networks by comparing Artificial Neural Networks, Support Vector Machines and Random Forests. Interestingly, the best classification performance is achieved within a small threshold range for all methods, suggesting that it might represent an effective choice to recover useful information while filtering out noise from data. Overall, the best accuracy is achieved with SVMs, which exhibit performances similar to those obtained when gene expression profiles are used as features. CONCLUSION: These findings demonstrate that the topological properties of sample-specific metabolic networks are effective in classifying cancer and normal samples, suggesting that useful information can be extracted from a relatively limited number of features.
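
The threshold-based pruning of weighted networks mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the edge representation, the pruning rule, or which topological properties were used, so the functions `prune_edges` and `node_degrees`, the "keep weights at or above the threshold" rule, and the choice of node degree as a feature are all hypothetical.

```python
def prune_edges(weighted_edges, threshold):
    """Keep only the edges whose weight is at least `threshold`.

    `weighted_edges` maps an undirected edge (u, v) to a numeric weight.
    """
    return {edge: w for edge, w in weighted_edges.items() if w >= threshold}


def node_degrees(edges):
    """Count how many retained edges touch each node (a simple topological feature)."""
    degrees = {}
    for u, v in edges:
        degrees[u] = degrees.get(u, 0) + 1
        degrees[v] = degrees.get(v, 0) + 1
    return degrees


if __name__ == "__main__":
    # Toy network with made-up weights, for illustration only.
    network = {("a", "b"): 0.9, ("b", "c"): 0.1, ("a", "c"): 0.5}
    pruned = prune_edges(network, threshold=0.5)
    print(node_degrees(pruned))
```

Per-sample feature vectors built this way could then be passed to any standard classifier. Sweeping the threshold shows how many edges, and therefore how much signal or noise, survive at each setting.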

11.
Patterns (N Y) ; 2(3): 100212, 2021 Mar 12.
Article in English | MEDLINE | ID: mdl-33728416

ABSTRACT

We introduce VERSO, a two-step framework for the characterization of viral evolution from sequencing data of viral genomes, which is an improvement on phylogenomic approaches for consensus sequences. VERSO exploits an efficient algorithmic strategy to return robust phylogenies from clonal variant profiles, also in conditions of sampling limitations. It then leverages variant frequency patterns to characterize the intra-host genomic diversity of samples, revealing undetected infection chains and pinpointing variants likely involved in homoplasies. On simulations, VERSO outperforms state-of-the-art tools for phylogenetic inference. Notably, the application to 6,726 amplicon and RNA sequencing samples refines the estimation of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) evolution, while co-occurrence patterns of minor variants unveil undetected infection paths, which are validated with contact tracing data. Finally, the analysis of SARS-CoV-2 mutational landscape uncovers a temporal increase of overall genomic diversity and highlights variants transiting from minor to clonal state and homoplastic variants, some of which fall on the spike gene. Available at: https://github.com/BIMIB-DISCo/VERSO.

12.
iScience ; 24(2): 102116, 2021 Feb 19.
Article in English | MEDLINE | ID: mdl-33532709

ABSTRACT

To dissect the mechanisms underlying the inflation of variants in the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) genome, we present a large-scale analysis of intra-host genomic diversity, which reveals that most samples exhibit heterogeneous genomic architectures, due to the interplay between host-related mutational processes and transmission dynamics. The decomposition of minor variants profiles unveils three non-overlapping mutational signatures related to nucleotide substitutions and likely ruled by APOlipoprotein B Editing Complex (APOBEC), Reactive Oxygen Species (ROS), and Adenosine Deaminase Acting on RNA (ADAR), highlighting heterogeneous host responses to SARS-CoV-2 infections. A corrected-for-signatures dN/dS analysis demonstrates that such mutational processes are affected by purifying selection, with important exceptions. In fact, several mutations appear to transit toward clonality, defining new clonal genotypes that increase the overall genomic diversity. Furthermore, the phylogenomic analysis shows the presence of homoplasies and supports the hypothesis of transmission of minor variants. This study paves the way for the integrated analysis of intra-host genomic diversity and clinical outcomes of SARS-CoV-2 infections.

13.
Brief Bioinform ; 22(4), 2021 07 20.
Article in English | MEDLINE | ID: mdl-33003202

ABSTRACT

MOTIVATION: The advancements of single-cell sequencing methods have paved the way for the characterization of cellular states at unprecedented resolution, revolutionizing the investigation of complex biological systems. Yet, single-cell sequencing experiments are hindered by several technical issues, which cause output data to be noisy, impacting the reliability of downstream analyses. Therefore, a growing number of data science methods have been proposed to recover lost or corrupted information from single-cell sequencing data. To date, however, no quantitative benchmarks have been proposed to evaluate such methods. RESULTS: We present a comprehensive analysis of the state-of-the-art computational approaches for denoising and imputation of single-cell transcriptomic data, comparing their performance in different experimental scenarios. In detail, we compared 19 denoising and imputation methods, on both simulated and real-world datasets, with respect to several performance metrics related to imputation of dropout events, recovery of true expression profiles, characterization of cell similarity, identification of differentially expressed genes and computation time. The effectiveness and scalability of all methods were assessed with regard to distinct sequencing protocols, sample size and different levels of biological variability and technical noise. As a result, we identify a subset of versatile approaches exhibiting solid performances on most tests and show that certain algorithmic families prove effective on specific tasks but inefficient on others. Finally, most methods appear to benefit from the introduction of appropriate assumptions on noise distribution of biological processes.


Subject(s)
Gene Expression Profiling , RNA-Seq , Single-Cell Analysis , Software , Animals , Humans
14.
Article in English | MEDLINE | ID: mdl-32548108

ABSTRACT

One of the key challenges in current cancer research is the development of computational strategies to support clinicians in the identification of successful personalized treatments. Control theory might be an effective approach to this end, as proven by the long-established application to therapy design and testing. In this respect, we here introduce the Control Theory for Therapy Design (CT4TD) framework, which employs optimal control theory on patient-specific pharmacokinetics (PK) and pharmacodynamics (PD) models, to deliver optimized therapeutic strategies. The definition of personalized PK/PD models allows one to explicitly consider the physiological heterogeneity of individuals and to adapt the therapy accordingly, as opposed to standard clinical practices. CT4TD can be used in two distinct scenarios. At the time of the diagnosis, CT4TD allows one to set optimized personalized administration strategies, aimed at reaching selected target drug concentrations, while minimizing the costs in terms of toxicity and adverse effects. Moreover, if longitudinal data on patients under treatment are available, our approach allows one to adjust the ongoing therapy, by relying on simplified models of cancer population dynamics, with the goal of minimizing or controlling the tumor burden. CT4TD is highly scalable, as it employs the efficient dCRAB/RedCRAB optimization algorithm, and the results are robust, as proven by extensive tests on synthetic data. Furthermore, the theoretical framework is general, and it might be applied to any therapy for which a PK/PD model can be estimated, and for any kind of administration and cost. As a proof of principle, we present the application of CT4TD to Imatinib administration in Chronic Myeloid Leukemia, in which we adopt a simplified model of cancer population dynamics. In particular, we show that the optimized therapeutic strategies are diversified among patients, and display improvements with respect to the current standard regime.

15.
Cell Death Dis ; 9(3): 349, 2018 03 02.
Article in English | MEDLINE | ID: mdl-29500381

ABSTRACT

Chronic Myeloid Leukemia (CML) is a stem cell cancer that arises when the t(9;22) translocation occurs in a hematopoietic stem cell (HSC). This event results in the expression of the BCR-ABL1 fusion gene, which codes for a constitutively active tyrosine kinase that is responsible for the transformation of an HSC into a CML stem cell, which then gives rise to a clonal myeloproliferative disease. The introduction of Tyrosine Kinase Inhibitors (TKIs) has revolutionized the management of the disease. However, these drugs do not seem to be able to eradicate the malignancy. Indeed, discontinuation trials (STIM; TWISER; DADI) for those patients who achieved a profound molecular response showed 50% relapsing within 12 months. We performed a comparative analysis on 15 CML patients and one B-ALL patient, between the standard quantitative reverse-transcriptase PCR (qRT-PCR) and our genomic DNA patient-specific quantitative PCR assay (gDNA qPCR). Here we demonstrate that gDNA qPCR is better than standard qRT-PCR in disease monitoring after an average follow-up period of 200 days. Specifically, we statistically demonstrated that DNA negativity is more reliable than RNA negativity in indicating when TKI therapy can be safely stopped.


Subject(s)
Fusion Proteins, bcr-abl/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology , Neoplastic Stem Cells/metabolism , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/pathology , Real-Time Polymerase Chain Reaction/methods , Reverse Transcriptase Polymerase Chain Reaction/methods , Aged , DNA/genetics , Female , Follow-Up Studies , Humans , Imatinib Mesylate/pharmacology , Imatinib Mesylate/therapeutic use , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Male , Middle Aged , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Protein-Tyrosine Kinases/antagonists & inhibitors , RNA, Messenger/genetics , Reproducibility of Results , Treatment Outcome , Young Adult