ABSTRACT
The recent trend of using network and graph structures to represent many different data types has renewed interest in the graph partitioning (GP) problem. This interest stems from the need for general methods that can both efficiently identify network communities and reduce the dimensionality of large graphs while satisfying various application-specific criteria. Traditional clustering algorithms often struggle to capture the complex relationships within graphs and to generalize to arbitrary clustering criteria. The emergence of graph neural networks (GNNs) as a powerful framework for learning representations of graph data offers new approaches to the problem. Previous work has shown that GNNs can propose partitionings under a variety of criteria. However, these approaches have not yet been extended to Markov chains or kinetic networks, which arise frequently in the study of molecular systems and are of particular interest to the biomolecular modeling community. In this work, we propose several GNN-based architectures to tackle the GP problem for Markov chains described as kinetic networks. The approach aims to maximize the Kemeny constant, a variational quantity that represents the sum of the time scales of the system. We propose an encoder-decoder architecture and show how simple GraphSAGE-based GNNs with linear layers can outperform much larger and more expressive attention-based models in this context. As a proof of concept, we first demonstrate the method's ability to cluster randomly connected graphs. We then apply it to a linear-chain kinetic network corresponding to a 1D free energy profile. Finally, we demonstrate the effectiveness of our method through experiments on a data set derived from molecular dynamics, and we compare its performance to that of other partitioning techniques, such as PCCA+.
We explore the importance of feature and hyperparameter selection and propose a general strategy for large-scale parallel training of GNNs for discovering optimal graph partitionings.
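To make the objective concrete, the following is a minimal sketch of how the Kemeny constant of a partitioned Markov chain could be evaluated. It is an illustration under stated assumptions, not the architecture or training loss described above: it assumes a discrete-time, row-stochastic transition matrix, uses the convention K = sum over i >= 2 of 1/(1 - lambda_i) (other conventions differ by an additive constant), and lumps states by stationary-flux aggregation. The names `kemeny_constant`, `stationary_distribution`, `coarse_grain`, `P_chain`, and `labels` are hypothetical.

```python
import numpy as np

def kemeny_constant(P):
    """Kemeny constant of an irreducible row-stochastic matrix P.

    Assumed convention: sum of 1 / (1 - lambda_i) over all eigenvalues
    except the stationary eigenvalue lambda = 1.
    """
    eigvals = np.linalg.eigvals(P)
    stationary_idx = np.argmin(np.abs(eigvals - 1.0))
    rest = np.delete(eigvals, stationary_idx)
    return float(np.real(np.sum(1.0 / (1.0 - rest))))

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def coarse_grain(P, labels, n_clusters):
    """Lump states into clusters by aggregating stationary fluxes.

    This lumping scheme is an assumption made for illustration.
    """
    pi = stationary_distribution(P)
    M = np.zeros((P.shape[0], n_clusters))
    M[np.arange(P.shape[0]), labels] = 1.0   # hard membership matrix
    flux = M.T @ (pi[:, None] * P) @ M       # cluster-to-cluster flux
    return flux / flux.sum(axis=1, keepdims=True)

# Toy 4-state linear chain with a slow step between states 1 and 2.
P_chain = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])
labels = np.array([0, 0, 1, 1])  # candidate partition: {0, 1} and {2, 3}

P_coarse = coarse_grain(P_chain, labels, n_clusters=2)
print("Kemeny constant (full chain):   ", kemeny_constant(P_chain))
print("Kemeny constant (coarse chain): ", kemeny_constant(P_coarse))
```

In a learned setting, the hard `labels` array would be replaced by soft cluster assignments produced by the network, so that the score can be optimized by gradient ascent; that step is outside this sketch.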
ABSTRACT
The RNA helicase (non-structural protein 13, NSP13) of SARS-CoV-2 is essential for viral replication and is highly conserved within the Coronaviridae family, making it a prominent drug target for treating COVID-19. We present structural models and dynamics of the helicase in complex with its native substrates, based on a thorough analysis of homologous sequences and existing experimental structures. We performed and analysed microseconds of molecular dynamics (MD) simulations, and our model provides valuable insights into the binding of ATP and ssRNA at the atomic level. We identify the principal motions characterising the enzyme and highlight the effect of the natural substrates on these dynamics. Furthermore, our pocket analysis suggests allosteric binding sites. These structural and dynamical insights are important for subsequent studies of the catalytic function and for the development of specific inhibitors targeting the binding pockets characterised here.