Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 9 de 9
Filter
Add more filters










Database
Language
Publication year range
1.
Sci Robot ; 7(69): eabo0235, 2022 08 31.
Article in English | MEDLINE | ID: mdl-36044556

ABSTRACT

Learning to combine control at the level of joint torques with longer-term goal-directed behavior is a long-standing challenge for physically embodied artificial agents. Intelligent behavior in the physical world unfolds across multiple spatial and temporal scales: Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals that are defined on much longer time scales and that often involve complex interactions with the environment and other agents. Recent research has demonstrated the potential of learning-based approaches applied to the respective problems of complex movement, long-term planning, and multiagent coordination. However, their integration traditionally required the design and optimization of independent subsystems and remains challenging. In this work, we tackled the integration of motor control and long-horizon decision-making in the context of simulated humanoid football, which requires agile motor control and multiagent coordination. We optimized teams of agents to play simulated football via reinforcement learning, constraining the solution space to that of plausible movements learned using human motion capture data. They were trained to maximize several environment rewards and to imitate pretrained football-specific skills if doing so led to improved performance. The result is a team of coordinated humanoid football players that exhibit complex behavior at different scales, quantified by a range of analysis and statistics, including those used in real-world sport analytics. Our work constitutes a complete demonstration of learned integrated decision-making at multiple scales in a multiagent setting.


Subject(s)
Football , Soccer , Humans , Learning , Movement , Reinforcement, Psychology , Soccer/physiology
2.
Nat Commun ; 11(1): 5603, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33154362

ABSTRACT

Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence. Traditionally, researchers have focused on using well-known games to build strong agents. This progress, however, can be better informed by characterizing games and their topological landscape. Tackling this latter question can facilitate understanding of agents and help determine what game an agent should target next as part of its training. Here, we show how network measures applied to response graphs of large-scale games enable the creation of a landscape of games, quantifying relationships between games of varying sizes and characteristics. We illustrate our findings in domains ranging from canonical games to complex empirical games capturing the performance of trained agents pitted against one another. Our results culminate in a demonstration leveraging this information to generate new and interesting games, including mixtures of empirical games synthesized from real world games.

3.
Nature ; 575(7782): 350-354, 2019 11.
Article in English | MEDLINE | ID: mdl-31666705

ABSTRACT

Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions1-3, the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems4. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks5,6. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.


Subject(s)
Reinforcement, Psychology , Video Games , Artificial Intelligence , Humans , Learning
4.
Sci Rep ; 9(1): 9937, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31289288

ABSTRACT

We introduce α-Rank, a principled evolutionary dynamics methodology, for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs). The approach leverages continuous-time and discrete-time evolutionary dynamical systems applied to empirical games, and scales tractably in the number of agents, in the type of interactions (beyond dyadic), and the type of empirical games (symmetric and asymmetric). Current models are fundamentally limited in one or more of these dimensions, and are not guaranteed to converge to the desired game-theoretic solution concept (typically the Nash equilibrium). α-Rank automatically provides a ranking over the set of agents under evaluation and provides insights into their strengths, weaknesses, and long-term dynamics in terms of basins of attraction and sink components. This is a direct consequence of the correspondence we establish to the dynamical MCC solution concept when the underlying evolutionary model's ranking-intensity parameter, α, is chosen to be large, which exactly forms the basis of α-Rank. In contrast to the Nash equilibrium, which is a static solution concept based solely on fixed points, MCCs are a dynamical solution concept based on the Markov chain formalism, Conley's Fundamental Theorem of Dynamical Systems, and the core ingredients of dynamical systems: fixed points, recurrent sets, periodic orbits, and limit cycles. Our α-Rank method runs in polynomial time with respect to the total number of pure strategy profiles, whereas computing a Nash equilibrium for a general-sum game is known to be intractable. We introduce mathematical proofs that not only provide an overarching and unifying perspective of existing continuous- and discrete-time evolutionary evaluation models, but also reveal the formal underpinnings of the α-Rank methodology. We illustrate the method in canonical games and empirically validate it in several domains, including AlphaGo, AlphaZero, MuJoCo Soccer, and Poker.

5.
Science ; 364(6443): 859-865, 2019 May 31.
Article in English | MEDLINE | ID: mdl-31147514

ABSTRACT

Reinforcement learning (RL) has shown great success in increasingly complex single-agent environments and two-player turn-based games. However, the real world contains multiple agents, each learning and acting independently to cooperate and compete with other agents. We used a tournament-style evaluation to demonstrate that an agent can achieve human-level performance in a three-dimensional multiplayer first-person video game, Quake III Arena in Capture the Flag mode, using only pixels and game points scored as input. We used a two-tier optimization process in which a population of independent RL agents are trained concurrently from thousands of parallel matches on randomly generated environments. Each agent learns its own internal reward signal and rich representation of the world. These results indicate the great potential of multiagent reinforcement learning for artificial intelligence research.


Subject(s)
Machine Learning , Reinforcement, Psychology , Video Games , Reward
6.
J Chem Inf Model ; 57(2): 133-147, 2017 02 27.
Article in English | MEDLINE | ID: mdl-28158942

ABSTRACT

The growing computational abilities of various tools that are applied in the broadly understood field of computer-aided drug design have led to the extreme popularity of virtual screening in the search for new biologically active compounds. Most often, the source of such molecules consists of commercially available compound databases, but they can also be searched for within the libraries of structures generated in silico from existing ligands. Various computational combinatorial approaches are based solely on the chemical structure of compounds, using different types of substitutions for new molecules formation. In this study, the starting point for combinatorial library generation was the fingerprint referring to the optimal substructural composition in terms of the activity toward a considered target, which was obtained using a machine learning-based optimization procedure. The systematic enumeration of all possible connections between preferred substructures resulted in the formation of target-focused libraries of new potential ligands. The compounds were initially assessed by machine learning methods using a hashed fingerprint to represent molecules; the distribution of their physicochemical properties was also investigated, as well as their synthetic accessibility. The examination of various fingerprints and machine learning algorithms indicated that the Klekota-Roth fingerprint and support vector machine were an optimal combination for such experiments. This study was performed for 8 protein targets, and the obtained compound sets and their characterization are publically available at http://skandal.if-pan.krakow.pl/comb_lib/ .


Subject(s)
Combinatorial Chemistry Techniques/methods , Drug Design , Machine Learning , Small Molecule Libraries/chemistry , Computer-Aided Design , Databases, Pharmaceutical
7.
Bioorg Med Chem Lett ; 27(3): 626-631, 2017 02 01.
Article in English | MEDLINE | ID: mdl-27993519

ABSTRACT

Exponential growth in the number of compounds with experimentally verified activity towards particular target has led to the emergence of various databases gathering data on biological activity. In this study, the ligands of family A of the G Protein-Coupled Receptors that are collected in the ChEMBL database were examined, and special attention was given to serotonin receptors. Sets of compounds were examined in terms of their appearance over time, they were mapped to the chemical space of drugs deposited in DrugBank, and the emergence of structurally new clusters of compounds was indicated. In addition, a tool for detailed analysis of the obtained visualizations was prepared and made available online at http://chem.gmum.net/vischem, which enables the investigation of chemical structures while referring to particular data points depicted in the figures and changes in compounds datasets over time.


Subject(s)
Ligands , Receptors, G-Protein-Coupled/metabolism , Databases, Chemical , Internet , Protein Binding , Receptors, G-Protein-Coupled/chemistry , Receptors, Serotonin/chemistry , Receptors, Serotonin/metabolism , User-Computer Interface
8.
Molecules ; 20(11): 20107-17, 2015 Nov 09.
Article in English | MEDLINE | ID: mdl-26569196

ABSTRACT

Speed, a relatively low requirement for computational resources and high effectiveness of the evaluation of the bioactivity of compounds have caused a rapid growth of interest in the application of machine learning methods to virtual screening tasks. However, due to the growth of the amount of data also in cheminformatics and related fields, the aim of research has shifted not only towards the development of algorithms of high predictive power but also towards the simplification of previously existing methods to obtain results more quickly. In the study, we tested two approaches belonging to the group of so-called 'extremely randomized methods'-Extreme Entropy Machine and Extremely Randomized Trees-for their ability to properly identify compounds that have activity towards particular protein targets. These methods were compared with their 'non-extreme' competitors, i.e., Support Vector Machine and Random Forest. The extreme approaches were not only found out to improve the efficiency of the classification of bioactive compounds, but they were also proved to be less computationally complex, requiring fewer steps to perform an optimization procedure.


Subject(s)
Drug Discovery/methods , Machine Learning , Models, Theoretical , Algorithms , Datasets as Topic
9.
J Cheminform ; 7: 38, 2015.
Article in English | MEDLINE | ID: mdl-26273325

ABSTRACT

BACKGROUND: Support Vector Machine has become one of the most popular machine learning tools used in virtual screening campaigns aimed at finding new drug candidates. Although it can be extremely effective in finding new potentially active compounds, its application requires the optimization of the hyperparameters with which the assessment is being run, particularly the C and [Formula: see text] values. The optimization requirement in turn, establishes the need to develop fast and effective approaches to the optimization procedure, providing the best predictive power of the constructed model. RESULTS: In this study, we investigated the Bayesian and random search optimization of Support Vector Machine hyperparameters for classifying bioactive compounds. The effectiveness of these strategies was compared with the most popular optimization procedures-grid search and heuristic choice. We demonstrated that Bayesian optimization not only provides better, more efficient classification but is also much faster-the number of iterations it required for reaching optimal predictive performance was the lowest out of the all tested optimization methods. Moreover, for the Bayesian approach, the choice of parameters in subsequent iterations is directed and justified; therefore, the results obtained by using it are constantly improved and the range of hyperparameters tested provides the best overall performance of Support Vector Machine. Additionally, we showed that a random search optimization of hyperparameters leads to significantly better performance than grid search and heuristic-based approaches. CONCLUSIONS: The Bayesian approach to the optimization of Support Vector Machine parameters was demonstrated to outperform other optimization methods for tasks concerned with the bioactivity assessment of chemical compounds. This strategy not only provides a higher accuracy of classification, but is also much faster and more directed than other approaches for optimization. It appears that, despite its simplicity, random search optimization strategy should be used as a second choice if Bayesian approach application is not feasible.Graphical abstractThe improvement of classification accuracy obtained after the application of Bayesian approach to the optimization of Support Vector Machines parameters.

SELECTION OF CITATIONS
SEARCH DETAIL
...