ABSTRACT
Motivation: Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Results: Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and require additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular abundance of primary target proteins was found critical for predicting the response to targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients.
Availability and implementation: Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Gene Expression Regulation, Neoplastic; Mass Spectrometry/methods; Neoplasms/genetics; Proteomics/methods; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Biomarkers; Cell Line, Tumor; Humans; Neoplasms/drug therapy; Neoplasms/metabolism; Protein Array Analysis/methods
ABSTRACT
MOTIVATION: A prime challenge in precision cancer medicine is to identify genomic and molecular features that are predictive of drug treatment responses in cancer cells. Although there are several computational models for accurate drug response prediction, these often lack the ability to infer which feature combinations are the most predictive, particularly for high-dimensional molecular datasets. As increasing amounts of diverse genome-wide data sources are becoming available, there is a need to build new computational models that can effectively combine these data sources and identify maximally predictive feature combinations. RESULTS: We present a novel approach that leverages systematic integration of data sources to identify response-predictive features of multiple drugs. To solve the modeling task, we implement a Bayesian linear regression method. To further improve the usefulness of the proposed model, we exploit the known human cancer kinome for identifying biologically relevant feature combinations. In case studies with a synthetic dataset and two publicly available cancer cell line datasets, we demonstrate the improved accuracy of our method compared to the widely used approaches in drug response analysis. As key examples, our model identifies meaningful combinations of features for the well-known EGFR, ALK, PLK and PDGFR inhibitors. AVAILABILITY AND IMPLEMENTATION: The source code of the method is available at https://github.com/suleimank/mvlr. CONTACT: muhammad.ammad-ud-din@helsinki.fi or suleiman.khan@helsinki.fi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Antineoplastic Agents/therapeutic use; Computational Biology/methods; Models, Biological; Neoplasms/drug therapy; Precision Medicine/methods; Software; Algorithms; Antineoplastic Agents/pharmacology; Bayes Theorem; Humans; Linear Models; Neoplasms/genetics; Neoplasms/metabolism; Signal Transduction/drug effects
ABSTRACT
BACKGROUND: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease. METHODS: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance.
Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone. FINDINGS: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0·791; Bayes factor >5) and surpassed the reference model (iAUC 0·743; Bayes factor >20). Both the ePCR model and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3·32, 95% CI 2·39-4·62, p<0·0001; reference model: 2·56, 1·85-3·53, p<0·0001). The new model was validated further on the ENTHUSE M1 cohort with similarly high performance (iAUC 0·768). Meta-analysis across all methods confirmed previously identified predictive clinical variables and revealed aspartate aminotransferase as an important, albeit previously under-reported, prognostic biomarker. INTERPRETATION: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. FUNDING: Sanofi US Services, Project Data Sphere.
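The hazard ratios and 95% confidence intervals quoted in the FINDINGS can be cross-checked with a little arithmetic. The sketch below is an illustration only and assumes, as is conventional but not stated in the abstract, that the intervals are Wald-type and symmetric on the log scale. Under that assumption the point estimate should equal the geometric mean of the interval bounds, and an approximate z-score follows from the interval width. The helper name `log_scale_summary` is hypothetical.

```python
import math

def log_scale_summary(lower, upper, z_crit=1.96):
    """Recover point estimate, log-scale standard error, and z-score
    from a 95% CI, ASSUMING a Wald interval symmetric on the log scale."""
    log_mid = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return math.exp(log_mid), se, log_mid / se

# Interval bounds as quoted in the abstract (decimal points normalised).
epcr_hr, epcr_se, epcr_z = log_scale_summary(2.39, 4.62)  # reported HR 3.32
ref_hr, ref_se, ref_z = log_scale_summary(1.85, 3.53)     # reported HR 2.56

print(f"ePCR:      implied HR {epcr_hr:.2f}, z = {epcr_z:.1f}")
print(f"reference: implied HR {ref_hr:.2f}, z = {ref_z:.1f}")
```

Both implied hazard ratios agree with the reported values to within rounding, and both z-scores exceed the two-sided threshold of about 3.89 that corresponds to p < 0.0001.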
Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Models, Statistical; Nomograms; Prostatic Neoplasms, Castration-Resistant/mortality; Adolescent; Adult; Aged; Bayes Theorem; Crowdsourcing; Docetaxel; Follow-Up Studies; Humans; Male; Middle Aged; Neoplasm Staging; Prednisone/administration & dosage; Prognosis; Prostatic Neoplasms, Castration-Resistant/drug therapy; Prostatic Neoplasms, Castration-Resistant/secondary; Survival Rate; Taxoids/administration & dosage; Young Adult
ABSTRACT
MOTIVATION: A key goal of computational personalized medicine is to systematically utilize genomic and other molecular features of samples to predict drug responses for a previously unseen sample. Such predictions are valuable for developing hypotheses for selecting therapies tailored for individual patients. This is especially valuable in oncology, where molecular and genetic heterogeneity of the cells has a major impact on the response. However, the prediction task is extremely challenging, raising the need for methods that can effectively model and predict drug responses. RESULTS: In this study, we propose a novel formulation of multi-task matrix factorization that allows selective data integration for predicting drug responses. To solve the modeling task, we extend the state-of-the-art kernelized Bayesian matrix factorization (KBMF) method with component-wise multiple kernel learning. In addition, our approach exploits the known pathway information in a novel and biologically meaningful fashion to learn the drug response associations. Our method quantitatively outperforms the state of the art on predicting drug responses in two publicly available cancer datasets as well as on a synthetic dataset. In addition, we validated our model predictions with lab experiments using an in-house cancer cell line panel. We finally show the practical applicability of the proposed method by utilizing prior knowledge to infer pathway-drug response associations, opening up the opportunity for elucidating drug action mechanisms. We demonstrate that pathway-response associations can be learned by the proposed model for the well-known EGFR and MEK inhibitors. AVAILABILITY AND IMPLEMENTATION: The source code implementing the method is available at http://research.cs.aalto.fi/pml/software/cwkbmf/ CONTACTS: muhammad.ammad-ud-din@aalto.fi or samuel.kaski@aalto.fi SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genomics; Neoplasms; Algorithms; Bayes Theorem; Drug Delivery Systems; Drug Discovery; Humans; Metabolic Networks and Pathways; Software
ABSTRACT
INTRODUCTION: This research analyzed the influence of television on the behavior of children from urban and rural socioeconomic backgrounds in Bhopal city and its vicinity. MATERIALS AND METHODS: About 400 parents with children between 1 and 18 years of age completed a self-designed questionnaire, which sought information regarding the television viewing habits of their children. Differences in responses were noted between the subjects of urban and rural areas. The data were analyzed using Pearson's chi-square test to determine the level of significance. RESULTS: The urban group led in the positive aspects of television viewing, with significantly better awareness of oral health, more attention to oral care adverts, and greater knowledge of the cause of dental caries. However, the urban group also showed poorer attitudes: the appearance of a dentist on television did not remind them of oral health needs, products offered with gifts appealed more to their children, their children were more demanding, parents fulfilled their children's demands more often, and they relied on their own judgment when selecting toothpaste. Overall, the rural group lagged significantly behind in all aspects. CONCLUSION: Television exerts both positive and negative influences on children's behavior in urban and rural communities, with the influence being more obvious in the urban group. CLINICAL SIGNIFICANCE: The results of this study can inform better, more effective advertising aimed at optimum oral health in children. Advertising can also target general health by discouraging obesogenic diets and promoting diets rich in protein and fiber.
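As an illustration of the statistical method named in MATERIALS AND METHODS, the sketch below computes Pearson's chi-square statistic for a contingency table of urban versus rural responses. The counts are hypothetical placeholders, not data from the study (the abstract gives no per-group counts), and the function names are illustrative. The p-value helper is valid only for one degree of freedom, i.e. a 2 x 2 table.

```python
import math

def chi_square_statistic(table):
    """Pearson's chi-square statistic and degrees of freedom
    for an r x c contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_one_df(stat):
    """Upper-tail p-value of the chi-square distribution; df = 1 ONLY."""
    return math.erfc(math.sqrt(stat / 2.0))

# HYPOTHETICAL counts: rows = urban, rural; columns = "yes", "no"
# answers to a single questionnaire item.
counts = [[120, 80],
          [90, 110]]

stat, df = chi_square_statistic(counts)
p = p_value_one_df(stat)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

With these placeholder counts the difference between groups would be significant at the 0.01 level. A real analysis would repeat the test for each questionnaire item.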
Subjects
Advertising/methods; Child Behavior/physiology; Dental Care; Health Behavior/physiology; Health Education, Dental/methods; Health Knowledge, Attitudes, Practice; Health Promotion/methods; Oral Health; Social Class; Television; Adolescent; Child; Child, Preschool; Dental Caries/prevention & control; Humans; India; Infant; Rural Population; Surveys and Questionnaires; Urban Population
ABSTRACT
MOTIVATION: Analysis of relationships of drug structure to biological response is key to understanding off-target and unexpected drug effects, and for developing hypotheses on how to tailor drug therapies. New methods are required for integrated analyses of a large number of chemical features of drugs against the corresponding genome-wide responses of multiple cell models. RESULTS: In this article, we present the first comprehensive multi-set analysis on how the chemical structure of drugs affects genome-wide gene expression across several cancer cell lines [Connectivity Map (CMap) database]. The task is formulated as searching for drug response components across multiple cancers to reveal shared effects of drugs and the chemical features that may be responsible. The components can be computed with an extension of a recent approach called Group Factor Analysis. We identify 11 components that link the structural descriptors of drugs with specific gene expression responses observed in the three cell lines and identify structural groups that may be responsible for the responses. Our method quantitatively outperforms the limited earlier methods on CMap and identifies both the previously reported associations and several interesting novel findings, by taking into account multiple cell lines and advanced 3D structural descriptors. The novel observations include: previously unknown similarities in the effects induced by 15-delta prostaglandin J2 and HSP90 inhibitors, which are linked to the 3D descriptors of the drugs; and the induction by simvastatin of leukemia-specific response, resembling the effects of corticosteroids. AVAILABILITY AND IMPLEMENTATION: Source code implementing the method is available at: http://research.ics.aalto.fi/mi/software/GFAsparse. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Antineoplastic Agents/chemistry; Antineoplastic Agents/pharmacology; Bayes Theorem; Cell Line, Tumor; Gene Expression/drug effects; Humans; Neoplasms/genetics; Neoplasms/metabolism; Structure-Activity Relationship
ABSTRACT
BACKGROUND: Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore more advanced integrative models are needed for predicting biological effects elicited by specific chemical features. As a step towards creating such computational links we developed a data-driven chemical systems biology approach to comprehensively study the relationship of 76 structural 3D-descriptors (VolSurf, chemical space) of 1159 drugs with the microarray gene expression responses (biological space) they elicited in three cancer cell lines. The analysis covering 11350 genes was based on data from the Connectivity Map. We decomposed the biological response profiles into components, each linked to a characteristic chemical descriptor profile. RESULTS: Integrated analysis of both the chemical and biological space was more informative than either dataset alone in predicting drug similarity as measured by shared protein targets. We identified ten major components that link distinct VolSurf chemical features across multiple compounds to specific cellular responses. For example, component 2 (hydrophobic properties) strongly linked to DNA damage response, while component 3 (hydrogen bonding) was associated with metabolic stress. Individual structural and biological features were often linked to one cell line only, such as leukemia cells (HL-60) specifically responding to cardiac glycosides. CONCLUSIONS: In summary, our approach identified several novel links between specific chemical structure properties and distinct biological responses in cells incubated with these drugs. Importantly, the analysis focused on chemical-biological properties that emerge across multiple drugs. 
The decoding of such systematic relationships is necessary to build better models of drug effects, including unanticipated types of molecular properties having strong biological effects.
Subjects
Antineoplastic Agents/chemistry; Antineoplastic Agents/pharmacology; Biomarkers, Pharmacological; Gene Expression Profiling/statistics & numerical data; Neoplasms/genetics; Genome, Human/drug effects; Genome, Human/genetics; Humans; Oligonucleotide Array Sequence Analysis; Structure-Activity Relationship; Systems Biology/methods; Transcriptome
ABSTRACT
The FDA recently approved eight targeted therapies for acute myeloid leukemia (AML), including the BCL-2 inhibitor venetoclax. Maximizing efficacy of these treatments requires refining patient selection. To this end, we analyzed two recent AML studies profiling the gene expression and ex vivo drug response of primary patient samples. We find that ex vivo samples often exhibit a general sensitivity to (any) drug exposure, independent of drug target. We observe that this "general response across drugs" (GRD) is associated with FLT3-ITD mutations, clinical response to standard induction chemotherapy, and overall survival. Further, incorporating GRD into expression-based regression models trained on one of the studies improved their performance in predicting ex vivo response in the second study, thus signifying its relevance to precision oncology efforts. We find that venetoclax response is independent of GRD but instead show that it is linked to expression of monocyte-associated genes by developing and applying a multi-source Bayesian regression approach. The method shares information across studies to robustly identify biomarkers of drug response and is broadly applicable in integrative analyses.
ABSTRACT
We combined clinical, cytokine, genomic, methylation and dietary data from 43 young adult monozygotic twin pairs (aged 22-36 years, 53% female), where 25 of the twin pairs were substantially weight discordant (delta body mass index > 3 kg/m²). These measurements were originally taken as part of the TwinFat study, a substudy of The Finnish Twin Cohort study. These five large multivariate datasets (comprising 42, 71, 1587, 1605 and 63 variables, respectively) were jointly analysed using an integrative machine learning method called group factor analysis (GFA) to offer new hypotheses about the multi-molecular-level interactions associated with the development of obesity. New potential links between cytokines and weight gain are identified, as well as associations between dietary, inflammatory and epigenetic factors. This encouraging case study aims to enthuse the research community to boldly attempt new machine learning approaches which have the potential to yield novel and unintuitive hypotheses. The source code of the GFA method is publicly available as the R package GFA.
ABSTRACT
We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study also demonstrates the value of releasing pre-publication data publicly to engage the community in an open research collaboration.
Subjects
Gene Expression/genetics; Genes, Essential/genetics; Algorithms; Cell Line, Tumor; Genomics/methods; Humans; RNA, Small Interfering/genetics
ABSTRACT
Drug discovery is moving away from the single target-based approach towards harnessing the potential of polypharmacological agents that modulate the activity of multiple nodes in the complex networks of deregulations underlying disease phenotypes. Computational network pharmacology methods that use systems-level drug-response phenotypes, such as those originating from genome-wide transcriptomic profiles, have proved particularly effective for elucidating the mechanisms of action of multitargeted compounds. Here, we show, via the case study of the natural product pinosylvin, how the combination of two complementary network-based methods can provide novel, unexpected mechanistic insights. This case study also illustrates that elucidating the mechanism of action of multitargeted natural products through transcriptional response-based approaches is a challenging endeavor, often requiring multiple computational-experimental iterations.
Subjects
Drug Discovery; Gene Regulatory Networks; Animals; Computational Biology; Humans
ABSTRACT
Deconvoluting the molecular target signals behind observed drug response phenotypes is an important part of phenotype-based drug discovery and repurposing efforts. We demonstrate here how our network-based deconvolution approach, named target addiction score (TAS), provides insights into the functional importance of druggable protein targets in cell-based drug sensitivity testing experiments. Using cancer cell line profiling data sets, we constructed a functional classification across 107 cancer cell models, based on their common and unique target addiction signatures. The pan-cancer addiction correlations could not be explained by the tissue of origin, and only correlated in part with molecular and genomic signatures of the heterogeneous cancer cells. The TAS-based cancer cell classification was also shown to be robust to drug response data resampling, as well as predictive of the transcriptomic patterns in an independent set of cancer cells that shared similar addiction signatures with the 107 cancers. The critical protein targets identified by the integrated approach were also shown to have clinically relevant mutation frequencies in patients with various cancer subtypes, including not only well-established pan-cancer genes, such as PTEN tumor suppressor, but also a number of targets that are less frequently mutated in specific cancer types, including ABL1 oncoprotein in acute myeloid leukemia. An application to leukemia patient primary cell models demonstrated how the target deconvolution approach offers functional insights into patient-specific addiction patterns, such as those indicative of their receptor-type tyrosine-protein kinase FLT3 internal tandem duplication (FLT3-ITD) status and co-addiction partners, which may lead to clinically actionable, personalized drug treatment developments. 
To promote its application in future drug testing studies, we have made available an open-source implementation of the TAS calculation in the form of a stand-alone R package.
Subjects
Antineoplastic Agents/therapeutic use; Drug Delivery Systems; Leukemia/drug therapy; Models, Biological; Cell Line, Tumor; Gene Expression Profiling; Humans; Leukemia/pathology; Organ Specificity
ABSTRACT
Predicting the best treatment strategy from genomic information is a core goal of precision medicine. Here we focus on predicting drug response based on a cohort of genomic, epigenomic and proteomic profiling data sets measured in human breast cancer cell lines. Through a collaborative effort between the National Cancer Institute (NCI) and the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we analyzed a total of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biological pathway information. We found that gene expression microarrays consistently provided the best predictive power of the individual profiling data sets; however, performance was increased by including multiple, independent data sets. We discuss the innovations underlying the top-performing methodology, Bayesian multitask MKL, and we provide detailed descriptions of all methods. This study establishes benchmarks for drug sensitivity prediction and identifies approaches that can be leveraged for the development of new methods.