ABSTRACT
MOTIVATION: Understanding the genomic heterogeneity of tumors is an important task in computational oncology, especially in the context of finding personalized treatments based on the genetic profile of each patient's tumor. Tumor clustering that takes into account the temporal order of genetic events, as represented by tumor mutation trees, is a powerful approach for grouping together patients with genetically and evolutionarily similar tumors and can provide insights into discovering tumor subtypes, for more accurate clinical diagnosis and prognosis. RESULTS: Here, we propose oncotree2vec, a method for clustering tumor mutation trees by learning vector representations of mutation trees that capture the different relationships between subclones in an unsupervised manner. Learning low-dimensional tree embeddings facilitates the visualization of relations between trees in large cohorts and can be used for downstream analyses, such as deep learning approaches for single-cell multi-omics data integration. We assessed the performance and the usefulness of our method in three simulation studies and on two real datasets: a cohort of 43 trees from six cancer types with different branching patterns corresponding to different modes of spatial tumor evolution and a cohort of 123 AML mutation trees. AVAILABILITY AND IMPLEMENTATION: https://github.com/cbg-ethz/oncotree2vec.
Subject(s)
Mutation, Neoplasms, Humans, Neoplasms/genetics, Cluster Analysis, Algorithms, Computational Biology/methods, Software
ABSTRACT
MOTIVATION: Metastasis formation is a hallmark of cancer lethality. Yet, metastases are generally unobservable during their early stages of dissemination and spread to distant organs. Genomic datasets of matched primary tumors and metastases may offer insights into the underpinnings and the dynamics of metastasis formation. RESULTS: We present metMHN, a cancer progression model designed to deduce the joint progression of primary tumors and metastases using cross-sectional cancer genomics data. The model elucidates the statistical dependencies among genomic events, the formation of metastasis, and the clinical emergence of both primary tumors and their metastatic counterparts. metMHN enables the chronological reconstruction of mutational sequences and facilitates estimation of the timing of metastatic seeding. In a study of nearly 5000 lung adenocarcinomas, metMHN pinpointed TP53 and EGFR as mediators of metastasis formation. Furthermore, the study revealed that post-seeding adaptation is predominantly influenced by frequent copy number alterations. AVAILABILITY AND IMPLEMENTATION: All datasets and code are available on GitHub at https://github.com/cbg-ethz/metMHN.
Subject(s)
Genomics, Neoplasm Metastasis, Humans, Genomics/methods, Neoplasm Metastasis/genetics, Lung Neoplasms/genetics, Lung Neoplasms/pathology, Disease Progression, Neoplasms/genetics, Neoplasms/pathology, Adenocarcinoma of Lung/genetics, Adenocarcinoma of Lung/pathology, Mutation, Tumor Suppressor Protein p53/genetics, Tumor Suppressor Protein p53/metabolism, Cross-Sectional Studies, ErbB Receptors/genetics
ABSTRACT
A better understanding of the features that define the interaction between cancer cells and immune cells is important for the development of new cancer therapies [1]. However, focus is often given to interactions that occur within the primary tumour and its microenvironment, whereas the role of immune cells during cancer dissemination in patients remains largely uncharacterized [2,3]. Circulating tumour cells (CTCs) are precursors of metastasis in several types of cancer [4-6], and are occasionally found within the bloodstream in association with non-malignant cells such as white blood cells (WBCs) [7,8]. The identity and function of these CTC-associated WBCs, as well as the molecular features that define the interaction between WBCs and CTCs, are unknown. Here we isolate and characterize individual CTC-associated WBCs, as well as corresponding cancer cells within each CTC-WBC cluster, from patients with breast cancer and from mouse models. We use single-cell RNA sequencing to show that in the majority of these cases, CTCs were associated with neutrophils. When comparing the transcriptome profiles of CTCs associated with neutrophils against those of CTCs alone, we detect a number of differentially expressed genes that outline cell cycle progression, leading to more efficient metastasis formation. Further, we identify cell-cell junction and cytokine-receptor pairs that define CTC-neutrophil clusters, representing key vulnerabilities of the metastatic process. Thus, the association between neutrophils and CTCs drives cell cycle progression within the bloodstream and expands the metastatic potential of CTCs, providing a rationale for targeting this interaction in treatment of breast cancer.
Subject(s)
Breast Neoplasms/pathology, Cell Cycle, Neoplasm Metastasis/pathology, Circulating Neoplastic Cells/pathology, Neutrophils/pathology, Animals, Breast Neoplasms/therapy, Cell Cycle/genetics, Tumor Cell Line, Cell Proliferation, Exons/genetics, Female, Gene Expression Profiling, Humans, Intercellular Junctions, Mice, Mutation/genetics, Neoplasm Metastasis/genetics, Circulating Neoplastic Cells/metabolism, Neutrophils/metabolism, RNA Sequence Analysis, Exome Sequencing
ABSTRACT
Dynamic Bayesian networks (DBNs) can be used for the discovery of gene regulatory networks (GRNs) from time series gene expression data. Here, we suggest a strategy for learning DBNs from gene expression data by employing a Bayesian approach that is scalable to large networks and is targeted at learning models with high predictive accuracy. Our framework can be used to learn DBNs for multiple groups of samples and highlight differences and similarities in their GRNs. We learn these DBN models based on different structural and parametric assumptions and select the optimal model based on the cross-validated predictive accuracy. We show in simulation studies that our approach is better equipped to prevent overfitting than techniques used in previous studies. We applied the proposed DBN-based approach to two time series transcriptomic datasets from the Gene Expression Omnibus database, each comprising data from distinct phenotypic groups of the same tissue type. In the first case, we used DBNs to characterize responders and non-responders to anti-cancer therapy. In the second case, we compared normal to tumor cells of colorectal tissue. The classification accuracy reached by the DBN-based classifier for both datasets was higher than reported previously. For the colorectal cancer dataset, our analysis suggested that GRNs for cancer and normal tissues differ substantially, with the differences most pronounced in the neighborhoods of oncogenes and known cancer tissue markers. The identified differences in gene networks of cancer and normal cells may be used for the discovery of targeted therapies.
Subject(s)
Computational Biology, Gene Regulatory Networks, Algorithms, Bayes Theorem, Computational Biology/methods, Computer Simulation
ABSTRACT
MOTIVATION: Gene set enrichment methods are a common tool to improve the interpretability of gene lists as obtained, for example, from differential gene expression analyses. They are based on computing whether dysregulated genes are located in certain biological pathways more often than expected by chance. Gene set enrichment tools rely on pre-existing pathway databases such as KEGG, Reactome, or the Gene Ontology. These databases are increasing in size and in the number of redundancies between pathways, which complicates the statistical enrichment computation. RESULTS: We address this problem and develop a novel gene set enrichment method, called pareg, which is based on a regularized generalized linear model and directly incorporates dependencies between gene sets related to certain biological functions, for example, due to shared genes, in the enrichment computation. We show that pareg is more robust to noise than competing methods. Additionally, we demonstrate the ability of our method to recover known pathways as well as to suggest novel treatment targets in an exploratory analysis using breast cancer samples from TCGA. AVAILABILITY AND IMPLEMENTATION: pareg is freely available as an R package on Bioconductor (https://bioconductor.org/packages/release/bioc/html/pareg.html) as well as on https://github.com/cbg-ethz/pareg. The GitHub repository also contains the Snakemake workflows needed to reproduce all results presented here.
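The "more often than expected by chance" computation that classical enrichment tools rely on can be sketched as a one-sided hypergeometric test. This is only the textbook baseline, not pareg's regularized generalized linear model; the function name, parameter names, and the toy numbers below are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe_size, pathway_size, hits_total, hits_in_pathway):
    """Return P(X >= hits_in_pathway) for a hypergeometric draw.

    hits_total dysregulated genes are drawn without replacement from a
    universe of universe_size genes, pathway_size of which belong to the
    pathway (hypothetical parameter names).
    """
    upper = min(pathway_size, hits_total)
    denominator = comb(universe_size, hits_total)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / denominator


# Toy numbers (assumed): 20 genes in total, 5 in the pathway,
# 4 dysregulated genes, 3 of which fall inside the pathway.
p_value = hypergeom_enrichment_pvalue(20, 5, 4, 3)
print(f"enrichment p-value: {p_value:.4f}")
```

Because each pathway is tested in isolation here, two pathways that share most of their genes would both be reported, which is the redundancy problem the abstract describes.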
Subject(s)
Factual Databases, Gene Ontology, Linear Models, Workflow
ABSTRACT
BACKGROUND: Liquid biopsy is a minimally-invasive method of sampling bodily fluids, capable of revealing evidence of cancer. The distribution of cell-free DNA (cfDNA) fragment lengths has been shown to differ between healthy subjects and cancer patients, whereby the distributional shift correlates with the sample's tumour content. These fragmentomic data have not yet been utilised to directly quantify the proportion of tumour-derived cfDNA in a liquid biopsy. RESULTS: We used statistical learning to predict tumour content from Fourier and wavelet transforms of cfDNA length distributions in samples from 118 cancer patients. The model was validated on an independent dilution series of patient plasma. CONCLUSIONS: This proof of concept suggests that our fragmentomic methodology could be useful for predicting tumour content in liquid biopsies.
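As a rough illustration of the feature-extraction idea, the sketch below turns a list of fragment lengths into a normalized length distribution and takes the magnitudes of its first few discrete Fourier coefficients. It covers only a Fourier step, omits the wavelet transform and the statistical learning model, and every name, length range, and toy value is an assumption rather than part of the published method.

```python
import cmath
from collections import Counter


def length_distribution(lengths, min_len, max_len):
    """Normalized histogram of fragment lengths over [min_len, max_len]."""
    counts = Counter(n for n in lengths if min_len <= n <= max_len)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments within the requested length range")
    return [counts[n] / total for n in range(min_len, max_len + 1)]


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs discrete Fourier coefficients."""
    n = len(signal)
    magnitudes = []
    for k in range(n_coeffs):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


# Toy fragment lengths in base pairs (assumed values, not real data).
toy_lengths = [166] * 6 + [145] * 3 + [320]
dist = length_distribution(toy_lengths, 100, 350)
features = dft_magnitudes(dist, 5)
print([round(f, 3) for f in features])
```

The zeroth coefficient is always 1 for a normalized distribution, so in practice only the higher coefficients would carry information about the shape of the distribution.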
Subject(s)
Cell-Free Nucleic Acids, Neoplasms, Humans, Cell-Free Nucleic Acids/genetics, Neoplasms/genetics, Neoplasms/pathology, Liquid Biopsy/methods, DNA, Tumor Biomarkers/genetics
ABSTRACT
Approximately 70% of clear cell renal cell carcinoma (ccRCC) is characterized by the biallelic inactivation of von Hippel-Lindau (VHL) on chromosome 3p. ELOC-mutated (Elongin C-mutated) renal cell carcinoma containing biallelic ELOC inactivations with chromosome 8q deletions is considered a novel subtype of renal cancer possessing a morphologic overlap with ccRCC, renal cell carcinoma (RCC) with fibromyomatous stroma exhibiting Tuberous Sclerosis Complex (TSC)/mammalian Target of Rapamycin (mTOR) mutations, and clear cell papillary tumor. However, the frequency and consequences of ELOC alterations in wild-type VHL and mutated VHL RCC are unclear. In this study, we characterize 123 renal tumors with clear cell morphology and known VHL mutation status to assess the morphologic and molecular consequences of ELOC inactivation. Using OncoScan and whole-exome sequencing, we identify 18 ELOC-deleted RCCs, 3 of which contain ELOC mutations resulting in the biallelic inactivation of ELOC. Biallelic ELOC and biallelic VHL aberrations were mutually exclusive; however, 2 ELOC-mutated RCCs showed monoallelic VHL alterations. Furthermore, no mutations in TSC1, TSC2, or mTOR were identified in ELOC-mutated RCC with biallelic ELOC inactivation. Using High Ambiguity Driven biomolecular DOCKing, we report a novel ELOC variant containing a duplication event disrupting ELOC-VHL interaction alongside the frequently seen Y79C alteration. Using hyper reaction monitoring mass spectrometry, we show RCCs with biallelic ELOC alterations have significantly reduced ELOC expression but similar carbonic anhydrase 9 and vascular endothelial growth factor A expression compared with classical ccRCC with biallelic VHL inactivation. The absence of biallelic VHL and TSC1, TSC2, or mTOR inactivation in RCC with biallelic ELOC inactivation (ELOC mutation in combination with ELOC deletions on chromosome 8q) supports the notion of a novel, molecularly defined tumor entity.
Subject(s)
Renal Cell Carcinoma, Kidney Neoplasms, Humans, Renal Cell Carcinoma/pathology, Vascular Endothelial Growth Factor A, Elongin/genetics, Von Hippel-Lindau Tumor Suppressor Protein/genetics, Kidney Neoplasms/genetics, Kidney Neoplasms/pathology, TOR Serine-Threonine Kinases
ABSTRACT
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Contact: evbc@unj-jena.de.
Subject(s)
COVID-19/prevention & control, Computational Biology, SARS-CoV-2/isolation & purification, Biomedical Research, COVID-19/epidemiology, COVID-19/virology, Viral Genome, Humans, Pandemics, SARS-CoV-2/genetics
ABSTRACT
MOTIVATION: Tumours evolve as heterogeneous populations of cells, which may be distinguished by different genomic aberrations. The resulting intra-tumour heterogeneity plays an important role in cancer patient relapse and treatment failure, so that obtaining a clear understanding of each patient's tumour composition and evolutionary history is key for personalized therapies. Single-cell sequencing (SCS) now provides the possibility to resolve tumour heterogeneity at the highest resolution of individual tumour cells, but brings with it challenges related to the particular noise profiles of the sequencing protocols as well as the complexity of the underlying evolutionary process. RESULTS: By modelling the noise processes and allowing mutations to be lost or to reoccur during tumour evolution, we present a method to jointly call mutations in each cell, reconstruct the phylogenetic relationship between cells, and determine the locations of mutational losses and recurrences. Our Bayesian approach allows us to accurately call mutations as well as to quantify our certainty in such predictions. We show the advantages of allowing mutational loss or recurrence with simulated data and present its application to tumour SCS data. AVAILABILITY AND IMPLEMENTATION: SCIΦN is available at https://github.com/cbg-ethz/SCIPhIN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Genomics, Neoplasms, Bayes Theorem, Humans, Mutation, Neoplasms/genetics, Phylogeny, Software
ABSTRACT
MOTIVATION: Signaling pathways control cellular behavior. Dysregulated pathways, for example, due to mutations that cause genes and proteins to be expressed abnormally, can lead to diseases, such as cancer. RESULTS: We introduce a novel computational approach, called Differential Causal Effects (dce), which compares normal to cancerous cells using the statistical framework of causality. The method detects individual edges in a signaling pathway that are dysregulated in cancer cells, while accounting for confounding. Hence, technical artifacts have less influence on the results and dce is more likely to detect the true biological signals. We extend the approach to handle unobserved dense confounding, where each latent variable, such as batch effects or cell cycle states, affects many covariates. We show that dce outperforms competing methods on synthetic datasets and on CRISPR knockout screens. We validate its latent confounding adjustment properties on a GTEx (Genotype-Tissue Expression) dataset. Finally, in an exploratory analysis on breast cancer data from TCGA (The Cancer Genome Atlas), we recover known and discover new genes involved in breast cancer progression. AVAILABILITY AND IMPLEMENTATION: The method dce is freely available as an R package on Bioconductor (https://bioconductor.org/packages/release/bioc/html/dce.html) as well as on https://github.com/cbg-ethz/dce. The GitHub repository also contains the Snakemake workflows needed to reproduce all results presented here. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Breast Neoplasms, Software, Humans, Female, Genome, Signal Transduction
ABSTRACT
Comprehensive molecular characterization of cancer subtypes is essential for predicting clinical outcomes and searching for personalized treatments. We present bnClustOmics, a statistical model and computational tool for multi-omics unsupervised clustering, which serves a dual purpose: Clustering patient samples based on a Bayesian network mixture model and learning the networks of omics variables representing these clusters. The discovered networks encode interactions among all omics variables and provide a molecular characterization of each patient subgroup. We conducted simulation studies that demonstrated the advantages of our approach compared to other clustering methods in the case where the generative model is a mixture of Bayesian networks. We applied bnClustOmics to a hepatocellular carcinoma (HCC) dataset comprising genome (mutation and copy number), transcriptome, proteome, and phosphoproteome data. We identified three main HCC subtypes together with molecular characteristics, some of which are associated with survival even when adjusting for the clinical stage. Cluster-specific networks shed light on the links between genotypes and molecular phenotypes of samples within their respective clusters and suggest targets for personalized treatments.
Subject(s)
Hepatocellular Carcinoma, Liver Neoplasms, Bayes Theorem, Hepatocellular Carcinoma/genetics, Cluster Analysis, Humans, Liver Neoplasms/genetics, Proteome, Transcriptome
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful technique to decipher tissue composition at the single-cell level and to inform on disease mechanisms, tumor heterogeneity, and the state of the immune microenvironment. Although multiple methods for the computational analysis of scRNA-seq data exist, their application in a clinical setting demands standardized and reproducible workflows, targeted to extract, condense, and display the clinically relevant information. To this end, we designed scAmpi (Single Cell Analysis mRNA pipeline), a workflow that facilitates scRNA-seq analysis from raw read processing to informing on sample composition, clinically relevant gene and pathway alterations, and in silico identification of personalized candidate drug treatments. We demonstrate the value of this workflow for clinical decision making in a molecular tumor board as part of a clinical study.
Subject(s)
Single-Cell Analysis, Software, Gene Expression Profiling/methods, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Exome Sequencing, Workflow
ABSTRACT
Systematic perturbation screens provide comprehensive resources for the elucidation of cancer driver genes. The perturbation of many genes in relatively few cell lines in such functional screens necessitates the development of specialized computational tools with sufficient statistical power. Here we developed APSiC (Analysis of Perturbation Screens for identifying novel Cancer genes) to identify genetic drivers and effectors in perturbation screens even with few samples. Applying APSiC to the shRNA screen Project DRIVE, we identified well-known and novel putative mutational and amplified cancer genes across all cancer types and in specific cancer types. Additionally, APSiC discovered tumor-promoting and tumor-suppressive effectors for individual cancer types, including genes involved in cell cycle control, Wnt/β-catenin and Hippo signalling pathways. We functionally demonstrated that LRRC4B, a putative novel tumor-suppressive effector, suppresses proliferation by delaying cell cycle and modulates apoptosis in breast cancer. We demonstrate APSiC is a robust statistical framework for discovery of novel cancer genes through analysis of large-scale perturbation screens. The analysis of DRIVE using APSiC is provided as a web portal and represents a valuable resource for the discovery of novel cancer genes.
Subject(s)
Neoplastic Cell Transformation/genetics, Neoplasm Genes/genetics, Genomics, Neoplasms/genetics, Apoptosis/genetics, Tumor Cell Line, Gene Amplification/genetics, Humans, Neoplasms/pathology, Small Interfering RNA/genetics, Signal Transduction/genetics
ABSTRACT
MOTIVATION: Cancer is one of the most prevalent diseases in the world. Tumors arise due to important genes changing their activity, e.g. when inhibited or over-expressed. But these gene perturbations are difficult to observe directly. Molecular profiles of tumors can provide indirect evidence of gene perturbations. However, inferring perturbation profiles from molecular alterations is challenging due to error-prone molecular measurements and incomplete coverage of all possible molecular causes of gene perturbations. RESULTS: We have developed a novel mathematical method to analyze cancer driver genes and their patient-specific perturbation profiles. We combine genetic aberrations with gene expression data in a causal network derived across patients to infer unobserved perturbations. We show that our method can predict perturbations in simulations, CRISPR perturbation screens and breast cancer samples from The Cancer Genome Atlas. AVAILABILITY AND IMPLEMENTATION: The method is available as the R-package nempi at https://github.com/cbg-ethz/nempi and http://bioconductor.org/packages/nempi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
MOTIVATION: High-throughput sequencing technologies are used increasingly not only in viral genomics research but also in clinical surveillance and diagnostics. These technologies facilitate the assessment of the genetic diversity in intra-host virus populations, which affects transmission, virulence and pathogenesis of viral infections. However, there are two major challenges in analysing viral diversity. First, amplification and sequencing errors confound the identification of true biological variants, and second, the large data volumes represent computational limitations. RESULTS: To support viral high-throughput sequencing studies, we developed V-pipe, a bioinformatics pipeline combining various state-of-the-art statistical models and computational tools for automated end-to-end analyses of raw sequencing reads. V-pipe supports quality control, read mapping and alignment, low-frequency mutation calling, and inference of viral haplotypes. For generating high-quality read alignments, we developed a novel method, called ngshmmalign, based on profile hidden Markov models and tailored to small and highly diverse viral genomes. V-pipe also includes benchmarking functionality providing a standardized environment for comparative evaluations of different pipeline configurations. We demonstrate this capability by assessing the impact of three different read aligners (Bowtie 2, BWA MEM, ngshmmalign) and two different variant callers (LoFreq, ShoRAH) on the performance of calling single-nucleotide variants in intra-host virus populations. V-pipe supports various pipeline configurations and is implemented in a modular fashion to facilitate adaptations to the continuously changing technology landscape. AVAILABILITY AND IMPLEMENTATION: V-pipe is freely available at https://github.com/cbg-ethz/V-pipe. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
CRISPR-based systems have fundamentally transformed our ability to study and manipulate stem cells. We explored the possibility of using catalytically dead Cas9 (dCas9) from S. pyogenes as a platform for targeted epigenetic editing in stem cells to enhance the expression of the eomesodermin gene (EOMES) during differentiation. We observed, however, that the dCas9 protein itself exerts a potential non-specific effect in hiPSCs, affecting the cell's phenotype and gene expression patterns during subsequent directed differentiation. We show that this effect is specific to the condition when cells are cultured in medium that does not actively maintain the pluripotency network, and that the sgRNA-free apo-dCas9 protein itself influences endogenous gene expression. Transcriptomics analysis revealed that a significant number of genes involved in developmental processes and various other genes with non-overlapping biological functions are affected by dCas9 overexpression. This suggests a potential adverse phenotypic effect of dCas9 itself in hiPSCs, which could have implications for when and how CRISPR/Cas9-based tools can be used reliably and safely in pluripotent stem cells.
Subject(s)
CRISPR-Cas Systems, Induced Pluripotent Stem Cells, Gene Expression, Humans, Primitive Streak
ABSTRACT
Clonal hematopoiesis (CH) is associated with age and an increased risk of myeloid malignancies, cardiovascular risk, and all-cause mortality. We tested for CH in a setting where hematopoietic stem cells (HSCs) of the same individual are exposed to different degrees of proliferative stress and environments, i.e., in long-term survivors of allogeneic hematopoietic stem cell transplantation (allo-HSCT) and their respective related donors (n = 42 donor-recipient pairs). With a median follow-up time since allo-HSCT of 16 years (range, 10-32 years), we found a total of 35 mutations in 23 out of 84 (27.4%) study participants. Ten out of 42 donors (23.8%) and 13 out of 42 recipients (31%) had CH. CH was associated with older donor and recipient age. We identified 5 cases of donor-engrafted CH, with 1 case progressing into myelodysplastic syndrome in both donor and recipient. Four out of 5 cases showed increased clone size in recipients compared with donors. We further characterized the hematopoietic system in individuals with CH as follows: (1) CH was consistently present in myeloid cells but varied in penetrance in B and T cells; (2) colony-forming units (CFUs) revealed clonal evolution or multiple independent clones in individuals with multiple CH mutations; and (3) telomere shortening determined in granulocytes suggested ~20 years of added proliferative history of HSCs in recipients compared with their donors, with telomere length in CH vs non-CH CFUs showing varying patterns. This study provides insight into the long-term behavior of the same human HSCs and respective CH development under different proliferative conditions.
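The cohort counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only recomputes the reported proportions from the reported counts; the variable names are illustrative assumptions.

```python
# Counts as reported in the abstract.
pairs = 42
donors_with_ch = 10
recipients_with_ch = 13

participants = 2 * pairs                      # one donor and one recipient per pair
with_ch = donors_with_ch + recipients_with_ch


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(f"participants with CH: {with_ch}/{participants} = {pct(with_ch, participants)}%")
print(f"donors with CH: {donors_with_ch}/{pairs} = {pct(donors_with_ch, pairs)}%")
print(f"recipients with CH: {recipients_with_ch}/{pairs} = {pct(recipients_with_ch, pairs)}%")
```

The donor and recipient counts add up to the 23 participants with CH, and the three proportions match the reported 27.4%, 23.8%, and 31% after rounding.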
Subject(s)
Clonal Hematopoiesis, Hematopoietic Stem Cell Transplantation/mortality, Hematopoietic Stem Cells/metabolism, Tissue Donors, Adolescent, Adult, Aged, Aged 80 and over, Alleles, Clonal Evolution/genetics, Colony-Forming Units Assay, DNA Mutational Analysis, Female, Hematopoietic Stem Cells/cytology, Humans, Male, Middle Aged, Mutation, Prognosis, Telomere, Transplant Recipients, Homologous Transplantation, Treatment Outcome, Young Adult
ABSTRACT
An increasing body of experimental evidence suggests that anticancer and antimicrobial therapies may themselves promote the acquisition of drug resistance by increasing mutability. The successful control of evolving populations requires that such biological costs of control are identified, quantified and included in the evolutionarily informed treatment protocol. Here we identify, characterise and exploit a trade-off between decreasing the target population size and generating a surplus of treatment-induced rescue mutations. We show that the probability of cure is maximized at an intermediate dosage, below the drug concentration yielding maximal population decay, suggesting that treatment outcomes may in some cases be substantially improved by less aggressive treatment strategies. We also provide a general analytical relationship that implicitly links growth rate, pharmacodynamics and dose-dependent mutation rate to an optimal control law. Our results highlight the important, but often neglected, role of fundamental eco-evolutionary costs of control. These costs can often lead to situations where decreasing the cumulative drug dosage may be preferable even when the objective of the treatment is elimination, and not containment. Taken together, our results thus add to the ongoing criticism of the standard practice of administering aggressive, high-dose therapies and motivate further experimental and clinical investigation of the mutagenicity and other hidden collateral costs of therapies.
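The trade-off described above can be made concrete with a deliberately simple toy model. This is not the model from the paper: the saturating decay curve, the quadratic dose-dependent mutation rate, the parameter values, and the neglect of pre-existing mutants and of the establishment probability of new mutants are all assumptions chosen only to show how an intermediate dose can maximize the chance of cure.

```python
import math


def decay_rate(dose, max_decay=1.0, half_dose=1.0):
    """Net population decay rate; saturates with dose (assumed form)."""
    return max_decay * dose / (dose + half_dose)


def mutation_rate(dose, base_rate=1e-8, mutagenicity=1.0):
    """Per-division rescue mutation rate; grows with dose (assumed form)."""
    return base_rate * (1.0 + mutagenicity * dose ** 2)


def cure_probability(dose, n0=1e7, birth_rate=1.0):
    """Probability that no rescue mutation arises before extinction.

    With deterministic decline N(t) = n0 * exp(-decay * t), the total
    number of births is n0 * birth_rate / decay. Rescue mutations are
    treated as Poisson events, so P(cure) = exp(-expected rescues).
    """
    expected_rescues = mutation_rate(dose) * n0 * birth_rate / decay_rate(dose)
    return math.exp(-expected_rescues)


doses = [0.1 * i for i in range(1, 51)]       # dose grid from 0.1 to 5.0
best_dose = max(doses, key=cure_probability)
print(f"best dose on grid: {best_dose:.1f}")
print(f"P(cure) at best dose: {cure_probability(best_dose):.3f}")
print(f"P(cure) at highest dose: {cure_probability(doses[-1]):.3f}")
```

In this toy setting the fastest decay occurs at the highest dose on the grid, yet the cure probability peaks well below it, because the extra mutations induced at high doses outweigh the gain from killing the population faster.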
Subject(s)
Microbial Drug Resistance/genetics, Antineoplastic Drug Resistance/genetics, Anti-Infective Agents/administration & dosage, Antineoplastic Agents/administration & dosage, Computational Biology, Computer Simulation, Drug Dose-Response Relationship, Molecular Evolution, Host Microbial Interactions/drug effects, Host Microbial Interactions/genetics, Humans, Biological Models, Mutation/drug effects, Mutation Rate, Neoplasms/drug therapy, Neoplasms/genetics, Phenotype, Stochastic Processes
ABSTRACT
Although combination antiretroviral therapies seem to be effective at controlling HIV-1 infections regardless of the viral subtype, there is increasing evidence for subtype-specific drug resistance mutations. The order and rates at which resistance mutations accumulate in different subtypes also remain poorly understood. Most of this knowledge is derived from studies of subtype B genotypes, although subtype B is not the most abundant subtype worldwide. Here, we present a methodology for the comparison of mutational networks in different HIV-1 subtypes, based on Hidden Conjunctive Bayesian Networks (H-CBN), a probabilistic model for inferring mutational networks from cross-sectional genotype data. We introduce a Monte Carlo sampling scheme for learning H-CBN models for a larger number of resistance mutations and develop a statistical test to assess differences in the inferred mutational networks between two groups. We apply this method to infer the temporal progression of mutations conferring resistance to the protease inhibitor lopinavir in a large cross-sectional cohort of HIV-1 subtype C genotypes from South Africa, as well as to a data set of subtype B genotypes obtained from the Stanford HIV Drug Resistance Database and the Swiss HIV Cohort Study. We find strong support for different initial mutational events in the protease, namely at residue 46 in subtype B and at residue 82 in subtype C. The inferred mutational networks for subtype B versus C are significantly different, sharing only five constraints on the order of accumulating mutations, with mutation at residue 54 as the parental event. The results also suggest that mutations can accumulate along various alternative paths within subtypes, as opposed to a unique total temporal ordering. Beyond HIV drug resistance, the statistical methodology is applicable more generally for the comparison of inferred mutational networks between any two groups.
Subject(s)
Viral Drug Resistance/genetics, HIV Protease Inhibitors/pharmacology, HIV-1/drug effects, Lopinavir/pharmacology, Mutation, Bayes Theorem, Cohort Studies, HIV Infections/virology, HIV-1/classification, Humans
ABSTRACT
Tumour progression is an evolutionary process in which different clones evolve over time, leading to intra-tumour heterogeneity. Interactions between clones can affect tumour evolution and hence disease progression and treatment outcome. Intra-tumoural pairs of mutations that are overrepresented in a co-occurring or clonally exclusive fashion over a cohort of patient samples may be suggestive of a synergistic effect between the different clones carrying these mutations. We therefore developed a novel statistical testing framework, called GeneAccord, to identify such gene pairs that are altered in distinct subclones of the same tumour. We analysed our framework for calibration and power. By comparing its performance to baseline methods, we demonstrate that to control type I errors, it is essential to account for the evolutionary dependencies among clones. In applying GeneAccord to the single-cell sequencing of a cohort of 123 acute myeloid leukaemia patients, we find 1 clonally co-occurring and 8 clonally exclusive gene pairs. The clonally exclusive pairs mostly involve genes of the key signalling pathways.