1.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34524425

ABSTRACT

To enable personalized cancer treatment, machine learning models have been developed to predict drug response as a function of tumor and drug features. However, most algorithm development efforts have relied on cross-validation within a single study to assess model accuracy. While an essential first step, cross-validation within a biological data set typically provides an overly optimistic estimate of the prediction performance on independent test sets. To provide a more rigorous assessment of model generalizability between different studies, we use machine learning to analyze five publicly available cell line-based data sets: National Cancer Institute 60, Cancer Therapeutics Response Portal (CTRP), Genomics of Drug Sensitivity in Cancer, Cancer Cell Line Encyclopedia and Genentech Cell Line Screening Initiative (gCSI). Based on observed experimental variability across studies, we explore estimates of prediction upper bounds. We report performance results of a variety of machine learning models, with a multitasking deep neural network achieving the best cross-study generalizability. By multiple measures, models trained on CTRP yield the most accurate predictions on the remaining testing data, and gCSI is the most predictable among the cell line data sets included in this study. With these experiments and further simulations on partial data, two lessons emerge: (1) differences in viability assays can limit model generalizability across studies and (2) drug diversity, more than tumor diversity, is crucial for raising model generalizability in preclinical screening.
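The gap between within-study cross-validation and cross-study testing described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two "studies" are synthetic, the single feature and the `assay_scale` / `assay_offset` parameters are hypothetical stand-ins for differences between viability assays, and the model is a one-variable least-squares line rather than any model from the paper.

```python
import random
from statistics import fmean

def make_study(rng, n, slope, assay_scale, assay_offset, noise=0.1):
    """Synthetic study: one feature x, and a response distorted by a study-specific assay."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [assay_scale * slope * x + assay_offset + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, my - b * mx

def r2(ys, preds):
    """Coefficient of determination of predictions against observed values."""
    my = fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def within_study_cv_r2(xs, ys, k=5):
    """Mean R^2 over k folds, all drawn from the same study."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        b, a = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(r2([ys[i] for i in test], [b * xs[i] + a for i in test]))
    return fmean(scores)

def cross_study_r2(train_study, test_study):
    """Fit on one study, score on a different one."""
    b, a = fit_line(*train_study)
    xs, ys = test_study
    return r2(ys, [b * x + a for x in xs])

rng = random.Random(0)
# Same underlying biology (slope), but study B uses a hypothetical assay
# that rescales and shifts the measured response.
study_a = make_study(rng, 400, slope=2.0, assay_scale=1.0, assay_offset=0.0)
study_b = make_study(rng, 400, slope=2.0, assay_scale=0.6, assay_offset=0.5)

cv_score = within_study_cv_r2(*study_a)
transfer_score = cross_study_r2(study_a, study_b)
print(f"within-study CV R^2: {cv_score:.3f}")
print(f"cross-study R^2:     {transfer_score:.3f}")
```

In this toy setting the cross-validated score on study A is high while the score on study B is markedly lower, even though both studies share the same underlying relationship; only the simulated assay differs.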


Subject(s)
Neoplasms , Algorithms , Cell Line , Humans , Machine Learning , Neoplasms/drug therapy , Neoplasms/genetics , Neural Networks, Computer
2.
Nat Mater ; 19(1): 63-68, 2020 01.
Article in English | MEDLINE | ID: mdl-31636421

ABSTRACT

The intercalation of alkali ions into layered materials has played an essential role in battery technology since the development of the first lithium-ion electrodes. Coulomb repulsion between the intercalants leads to ordering of the intercalant sublattice, which hinders ionic diffusion and impacts battery performance. While conventional diffraction can identify the long-range order that can occur at discrete intercalant concentrations during the charging cycle, it cannot determine short-range order at other concentrations that also disrupt ionic mobility. In this Article, we show that the use of real-space transforms of single-crystal diffuse scattering, measured with high-energy synchrotron X-rays, allows a model-independent measurement of the temperature dependence of the length scale of ionic correlations along each of the crystallographic axes in sodium-intercalated V2O5. The techniques described here provide a new way of probing the evolution of structural ordering in crystalline materials.

3.
Proc Natl Acad Sci U S A ; 115(7): E1384-E1390, 2018 02 13.
Article in English | MEDLINE | ID: mdl-29382758

ABSTRACT

Recent theoretical work suggests that systematic pruning of disordered networks consisting of nodes connected by springs can lead to materials that exhibit a host of unusual mechanical properties. In particular, global properties such as Poisson's ratio or local responses related to deformation can be precisely altered. Tunable mechanical responses would be useful in areas ranging from impact mitigation to robotics and, more generally, for creation of metamaterials with engineered properties. However, experimental attempts to create auxetic materials based on pruning-based theoretical ideas have not been successful. Here we introduce a more realistic model of the networks, which incorporates angle-bending forces and the appropriate experimental boundary conditions. A sequential pruning strategy of select bonds in this model is then devised and implemented that enables engineering of specific mechanical behaviors upon deformation, both in the linear and in the nonlinear regimes. In particular, it is shown that Poisson's ratio can be tuned to arbitrary values. The model and concepts discussed here are validated by preparing physical realizations of the networks designed in this manner, which are produced by laser cutting 2D sheets and are found to behave as predicted. Furthermore, by relying on optimization algorithms, we exploit the networks' susceptibility to tuning to design networks that possess a distribution of stiffer and more compliant bonds and whose auxetic behavior is even greater than that of homogeneous networks. Taken together, the findings reported here serve to establish that pruned networks represent a promising platform for the creation of unique mechanical metamaterials.
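For readers unfamiliar with the quantity being tuned, Poisson's ratio is the negative ratio of transverse strain to axial strain, and a material is auxetic when that ratio is negative. The helper below is a minimal sketch of this standard definition; the function names and the example strain values are illustrative assumptions and are not taken from the paper.

```python
def poissons_ratio(axial_strain: float, transverse_strain: float) -> float:
    """Poisson's ratio: nu = -(transverse strain) / (axial strain)."""
    if axial_strain == 0:
        raise ValueError("axial strain must be non-zero")
    return -transverse_strain / axial_strain

def is_auxetic(nu: float) -> bool:
    """Auxetic materials widen when stretched, which gives a negative ratio."""
    return nu < 0

# Conventional sheet (illustrative values): stretched by 2%, narrows by 0.6%.
nu_conventional = poissons_ratio(0.02, -0.006)

# Auxetic network (illustrative values): stretched by 2%, widens by 1%.
nu_auxetic = poissons_ratio(0.02, 0.01)

print(round(nu_conventional, 3), is_auxetic(nu_conventional))
print(round(nu_auxetic, 3), is_auxetic(nu_auxetic))
```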

4.
Nat Mater ; 18(12): 1384, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31666686

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

5.
BMC Bioinformatics ; 19(Suppl 18): 483, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577742

ABSTRACT

BACKGROUND: Cancer is a complex, multiscale dynamical system, with interactions between tumor cells and non-cancerous host systems. Therapies act on this combined cancer-host system, sometimes with unexpected results. Systematic investigation of mechanistic computational models can augment traditional laboratory and clinical studies, helping identify the factors driving a treatment's success or failure. However, given the uncertainties regarding the underlying biology, these multiscale computational models can take many potential forms, in addition to encompassing high-dimensional parameter spaces. Therefore, the exploration of these models is computationally challenging. We propose that integrating two existing technologies, one to aid the construction of multiscale agent-based models and the other developed to enhance model exploration and optimization, can provide a computational means for high-throughput hypothesis testing, and eventually, optimization. RESULTS: In this paper, we introduce a high-throughput computing (HTC) framework that integrates a mechanistic 3-D multicellular simulator (PhysiCell) with an extreme-scale model exploration platform (EMEWS) to investigate high-dimensional parameter spaces. We show early results in applying PhysiCell-EMEWS to 3-D cancer immunotherapy and show insights on therapeutic failure. We describe a generalized PhysiCell-EMEWS workflow for high-throughput cancer hypothesis testing, where hundreds or thousands of mechanistic simulations are compared against data-driven error metrics to perform hypothesis optimization. CONCLUSIONS: While key notational and computational challenges remain, mechanistic agent-based models and high-throughput model exploration environments can be combined to systematically and rapidly explore key problems in cancer. These high-throughput computational experiments can improve our understanding of the underlying biology, drive future experiments, and ultimately inform clinical practice.


Subject(s)
Neoplasms/diagnosis , Humans , Models, Theoretical , Workflow
6.
BMC Bioinformatics ; 19(Suppl 18): 491, 2018 Dec 21.
Article in English | MEDLINE | ID: mdl-30577736

ABSTRACT

BACKGROUND: Current multi-petaflop supercomputers are powerful systems, but present challenges when faced with problems requiring large machine learning workflows. Complex algorithms running at system scale, often with different patterns that require disparate software packages and complex data flows, cause difficulties in assembling and managing large experiments on these machines. RESULTS: This paper presents a workflow system that makes progress on scaling machine learning ensembles; in this first release, these are specifically ensembles of deep neural networks that address problems in cancer research across the atomistic, molecular and population scales. The initial release of the application framework, which we call CANDLE/Supervisor, addresses the problem of hyper-parameter exploration of deep neural networks. CONCLUSIONS: Initial results with CANDLE on DOE systems at ORNL, ANL and NERSC (Titan, Theta and Cori, respectively) demonstrate both scaling and multi-platform execution.


Subject(s)
Early Detection of Cancer/methods , Machine Learning/trends , Neoplasms/diagnosis , Humans , Neoplasms/pathology , Neural Networks, Computer , Workflow
7.
PLoS One ; 14(7): e0211608, 2019.
Article in English | MEDLINE | ID: mdl-31287816

ABSTRACT

Bioinformatics research is frequently performed using complex workflows with multiple steps, fans, merges, and conditionals. This complexity makes management of the workflow difficult on a computer cluster, especially when running in parallel on large batches of data: hundreds or thousands of samples at a time. Scientific workflow management systems could help with that. Many are now being proposed, but is there yet a "best" workflow management system for bioinformatics? Such a system would need to satisfy numerous, sometimes conflicting requirements: from ease of use, to seamless deployment at peta- and exa-scale, and portability to the cloud. We evaluated Swift/T as a candidate for such a role by implementing a primary genomic variant calling workflow in the Swift/T language, focusing on workflow management, performance and scalability issues that arise from production-grade big data genomic analyses. In the process we introduced novel features into the language, which are now part of its open repository. Additionally, we formalized a set of design criteria for quality, robust, maintainable workflows that must function at scale in a production setting, such as a large genomic sequencing facility or a major hospital system. The use of Swift/T conveys two key advantages. (1) It operates transparently in multiple cluster scheduling environments (PBS Torque, SLURM, Cray aprun environment, etc.), thus a single workflow is trivially portable across numerous clusters. (2) The leaf functions of Swift/T permit developers to easily swap executables in and out of the workflow, which makes it easy to maintain and to request resources optimal for each stage of the pipeline. While Swift/T's data-level parallelism eliminates the need to code parallel analysis of multiple samples, it does make debugging more difficult, as is common for implicitly parallel code. Nonetheless, the language gives users a powerful and portable way to scale up analyses in many computing architectures.
The code for our implementation of a variant calling workflow using Swift/T can be found on GitHub at https://github.com/ncsa/Swift-T-Variant-Calling, with full documentation provided at http://swift-t-variant-calling.readthedocs.io/en/latest/.


Subject(s)
Computational Biology , Genomics , Software , Animals , Humans , Workflow
8.
IEEE Trans Comput Soc Syst ; 5(3): 884-895, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30349868

ABSTRACT

Agent-based models (ABMs) integrate multiple scales of behavior and data to produce higher-order dynamic phenomena and are increasingly used in the study of important social complex systems in biomedicine, socio-economics and ecology/resource management. However, the development, validation and use of ABMs are hampered by the need to execute very large numbers of simulations in order to identify their behavioral properties, a challenge accentuated by the computational cost of running realistic, large-scale, potentially distributed ABM simulations. In this paper we describe the Extreme-scale Model Exploration with Swift (EMEWS) framework, which is capable of efficiently composing and executing large ensembles of simulations and other "black box" scientific applications while integrating model exploration (ME) algorithms developed with the use of widely available third-party libraries written in popular languages such as R and Python. EMEWS combines novel stateful tasks with traditional run-to-completion many-task computing (MTC) and solves many problems relevant to high-performance workflows, including scaling to very large numbers (millions) of tasks, maintaining state and locality information, and enabling effective multiple-language problem solving. We present the high-level programming model of the EMEWS framework and demonstrate how it is used to integrate an active learning ME algorithm to dynamically and efficiently characterize the parameter space of a large and complex, distributed Message Passing Interface (MPI) agent-based infectious disease model.

9.
Proc Winter Simul Conf ; 2016: 206-220, 2016 Dec.
Article in English | MEDLINE | ID: mdl-31239603

ABSTRACT

As high-performance computing resources have become increasingly available, new modes of computational processing and experimentation have become possible. This tutorial presents the Extreme-scale Model Exploration with Swift/T (EMEWS) framework for combining existing capabilities for model exploration approaches (e.g., model calibration, metaheuristics, data assimilation) and simulations (or any "black box" application code) with the Swift/T parallel scripting language to run scientific workflows on a variety of computing resources, from desktop to academic clusters to Top 500 level supercomputers. We will present a number of use-cases, starting with a simple agent-based model parameter sweep, and ending with a complex adaptive parameter space exploration workflow coordinating ensembles of distributed simulations. The use-cases are published on a public repository for interested parties to download and run on their own.
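The simplest use case mentioned in this abstract, a parameter sweep over a "black box" agent-based model, can be sketched in plain Python. This is not EMEWS or Swift/T code and does not use their APIs: the `run_model` function is a hypothetical toy infection model, the parameter names and grid values are invented for illustration, and a standard-library thread pool stands in for a real workflow engine only to show the shape of the computation.

```python
import itertools
import random
from concurrent.futures import ThreadPoolExecutor

def run_model(params):
    """Hypothetical black-box simulation: a toy agent-based infection model."""
    infect_prob, recover_prob, seed = params
    rng = random.Random(seed)  # one seeded generator per run keeps runs reproducible
    n_agents, infected, recovered = 200, 5, 0
    for _ in range(50):
        susceptible = n_agents - infected - recovered
        # Each susceptible agent meets one random agent per step.
        p_meet_infected = infected / n_agents
        new_infections = sum(
            1 for _ in range(susceptible)
            if rng.random() < infect_prob * p_meet_infected
        )
        new_recoveries = sum(
            1 for _ in range(infected) if rng.random() < recover_prob
        )
        infected += new_infections - new_recoveries
        recovered += new_recoveries
    return {
        "infect_prob": infect_prob,
        "recover_prob": recover_prob,
        "seed": seed,
        # Fraction of agents that were ever infected.
        "attack_rate": (infected + recovered) / n_agents,
    }

# Full-factorial grid: 3 infection probabilities x 2 recovery probabilities x 3 replicates.
grid = list(itertools.product([0.1, 0.3, 0.5], [0.1, 0.2], range(3)))

# The pool only illustrates the "many independent tasks" structure; a real
# workflow system would dispatch these runs across cluster nodes instead.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_model, grid))

for row in results[:3]:
    print(row)
```

Because each run depends only on its own parameter tuple and seed, the runs can be executed in any order or on any worker, which is the property that lets a sweep like this scale from a desktop to a cluster.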
