Search | VHL Regional Portal

1.

KegAlign: Optimizing pairwise alignments with diagonal partitioning.

Gulhan, A Burak; Burhans, Richard; Harris, Robert; Kandemir, Mahmut; Haeussler, Maximilian; Nekrutenko, Anton.

bioRxiv ; 2024 Sep 03.

Article in English | MEDLINE | ID: mdl-39282333

ABSTRACT

Our ability to generate sequencing data and assemble it into high-quality complete genomes has advanced rapidly in recent years. These data promise to advance our understanding of organismal biology and answer longstanding evolutionary questions. Multiple genome alignment is a key tool in this quest. It is also the area that is lagging: today we can generate genomes faster than we can construct and update the multiple alignments containing them. The bottleneck is the considerable computational time required to generate accurate pairwise alignments between divergent genomes, an unavoidable precursor to multiple alignments. This step is typically performed with lastZ, a very sensitive but slow tool. Here we describe KegAlign, an optimized GPU-enabled pairwise aligner. It combines a new parallelization strategy, diagonal partitioning, with the latest features of modern GPUs. With KegAlign, a typical human/mouse alignment can be computed in under 6 hours on a machine containing a single NVIDIA A100 GPU and 80 CPU cores, without any pre-partitioning of input sequences: a ~150× improvement over lastZ. While other pairwise aligners can complete this task in a fraction of that time, none achieves the sensitivity of lastZ, KegAlign's main alignment engine, and thus they may not be suitable for comparing divergent genomes. In addition to the source code and a Conda package for KegAlign, we provide a Galaxy workflow that can be readily used by anyone.
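The abstract names diagonal partitioning but does not describe it. The sketch below only illustrates the general notion of a diagonal in pairwise alignment: a seed hit at target position t and query position q lies on diagonal t - q, and hits on nearby diagonals can be grouped into independent work units. It is not KegAlign's algorithm; the function name, the `band_width` parameter, and the sample hits are all hypothetical.

```python
from collections import defaultdict


def partition_hits_by_diagonal(hits, band_width):
    """Group (target_pos, query_pos) seed hits into bands of adjacent diagonals.

    The diagonal of a hit is target_pos - query_pos. band_width is a
    hypothetical tuning parameter, not a KegAlign option.
    """
    if band_width <= 0:
        raise ValueError("band_width must be positive")
    bands = defaultdict(list)
    for target_pos, query_pos in hits:
        diagonal = target_pos - query_pos
        # Floor division keeps negative diagonals in their own bands.
        bands[diagonal // band_width].append((target_pos, query_pos))
    # Sort bands and the hits inside each band so workers scan them in order.
    return {band: sorted(members) for band, members in sorted(bands.items())}


# Made-up seed hits, for illustration only.
hits = [(10, 10), (25, 24), (11, 11), (300, 100), (305, 104), (7, 90)]
partitions = partition_hits_by_diagonal(hits, band_width=50)
for band, members in partitions.items():
    print(band, members)
```

Each band could then be handed to a separate worker, since hits in different bands are far apart in diagonal space.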

2.

Predicting Protein-Ligand Docking Structure with Graph Neural Network.

Jiang, Huaipan; Wang, Jian; Cong, Weilin; Huang, Yihe; Ramezani, Morteza; Sarma, Anup; Dokholyan, Nikolay V; Mahdavi, Mehrdad; Kandemir, Mahmut T.

J Chem Inf Model ; 62(12): 2923-2932, 2022 06 27.

Article in English | MEDLINE | ID: mdl-35699430

ABSTRACT

Modern-day drug discovery is extremely expensive and time-consuming. Although computational approaches help accelerate drug discovery and reduce its cost, existing software packages for docking-based drug discovery suffer from both low accuracy and high latency. A few recent machine learning-based approaches have been proposed for virtual screening that improve the evaluation of protein-ligand binding affinity, but such methods rely heavily on conventional docking software to sample docking poses, which results in excessive execution latencies. Here, we propose and evaluate a novel graph neural network (GNN)-based framework, MedusaGraph, which includes both pose-prediction (sampling) and pose-selection (scoring) models. Unlike previous machine learning-centric studies, MedusaGraph generates the docking poses directly and achieves a 10- to 100-fold speedup compared to state-of-the-art approaches, with slightly better docking accuracy.

Subjects

Neural Networks, Computer; Proteins; Ligands; Molecular Docking Simulation; Protein Binding; Proteins/chemistry

3.

GPU-accelerated and pipelined methylation calling.

Feng, Yilin; Gudukbay Akbulut, Gulsum; Tang, Xulong; Gunasekaran, Jashwant Raj; Rahman, Amatur; Medvedev, Paul; Kandemir, Mahmut.

Bioinform Adv ; 2(1): vbac088, 2022.

Article in English | MEDLINE | ID: mdl-36699365

ABSTRACT

Motivation: Third-generation DNA sequencing technologies, such as Nanopore sequencing, can operate at very high speeds and produce longer reads, which makes the computational analysis of such massive data challenging. Nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data. The call-methylation module of Nanopolish detects methylation using a hidden Markov model (HMM). However, Nanopolish is limited by the long running time of some serial and computationally expensive processes. Among these, Adaptive Banded Event Alignment (ABEA) is the most time-consuming step, and prior work, f5c, has already parallelized and optimized ABEA on GPU. As a result, the remaining methylation score calculation, which uses the HMM to decide whether a given base is methylated, has become the new performance bottleneck.

Results: This article focuses on the call-methylation module of the Nanopolish package. We propose Galaxy-methyl, which parallelizes and optimizes the methylation score calculation step on GPU and then pipelines the four steps of the call-methylation module. Galaxy-methyl increases execution concurrency across CPUs and GPUs as well as the hardware resource utilization of both. The experimental results indicate that Galaxy-methyl achieves a 3×-5× speedup compared with Nanopolish and reduces total execution time by 35% compared with f5c, on average.

Availability and implementation: The source code of Galaxy-methyl is available at https://github.com/fengyilin118/.
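As a rough illustration of what pipelining four steps means, the sketch below connects four stages with queues so that each stage works on a different batch at the same time. It is a generic producer-consumer pattern written for this listing, not Galaxy-methyl's code: the abstract does not name the four steps, so the stage functions (`load`, `align`, `score`, `write`) and the `depth` parameter are hypothetical placeholders.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the batch stream


def run_stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(func(item))


def run_pipeline(batches, stage_funcs, depth=2):
    """Run stage_funcs as concurrent stages linked by queues."""
    # Bounded queues between stages give back-pressure; the final queue is
    # unbounded so the last stage never blocks while the input is still fed.
    queues = [queue.Queue(maxsize=depth) for _ in stage_funcs] + [queue.Queue()]
    workers = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stage_funcs)
    ]
    for worker in workers:
        worker.start()
    for batch in batches:
        queues[0].put(batch)
    queues[0].put(SENTINEL)
    for worker in workers:
        worker.join()
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            return results
        results.append(item)


# Hypothetical placeholder stages; each just records that it ran.
def load(batch_id):
    return {"id": batch_id, "trace": ["load"]}


def align(record):
    record["trace"].append("align")
    return record


def score(record):
    record["trace"].append("score")
    return record


def write(record):
    record["trace"].append("write")
    return record


results = run_pipeline(range(5), [load, align, score, write])
print([r["id"] for r in results])
```

With one thread per stage and first-in, first-out queues, batches leave the pipeline in the order they entered. In a real CPU/GPU setting the benefit comes from stages that wait on different hardware, which this toy example does not model.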

4.

GPU-Accelerated Flexible Molecular Docking.

Fan, Mengran; Wang, Jian; Jiang, Huaipan; Feng, Yilin; Mahdavi, Mehrdad; Madduri, Kamesh; Kandemir, Mahmut T; Dokholyan, Nikolay V.

J Phys Chem B ; 125(4): 1049-1060, 2021 02 04.

Article in English | MEDLINE | ID: mdl-33497567

ABSTRACT

Virtual screening is a key enabler of computational drug discovery and requires accurate and efficient structure-based molecular docking. In this work, we develop algorithms and software building blocks for molecular docking that can take advantage of graphics processing units (GPUs). Specifically, we focus on MedusaDock, a flexible protein-small molecule docking approach and platform. We accelerate the coarse docking phase of MedusaDock, as this step constitutes nearly 70% of total running time in typical use cases. We perform a comprehensive evaluation of quality and performance with single-GPU and multi-GPU acceleration using a data set of 3875 protein-ligand complexes. The algorithmic ideas, data structure design choices, and performance optimization techniques shed light on GPU acceleration of other structure-based molecular docking software tools.

Subjects

Algorithms; Software; Computer Graphics; Ligands; Molecular Docking Simulation; Proteins
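The "nearly 70% of total running time" figure in the MedusaDock abstract above bounds what accelerating only the coarse docking phase can achieve. By Amdahl's law, if a fraction p of the run is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s), which approaches 1 / (1 - p), about 3.33× for p = 0.70, as s grows. The sketch below treats 0.70 as an approximation of the stated figure; the phase speedup values are hypothetical and are not results from the paper.

```python
def overall_speedup(accelerated_fraction, phase_speedup):
    """Amdahl's law: whole-run speedup when one phase becomes faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if phase_speedup <= 0:
        raise ValueError("phase_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / phase_speedup)


COARSE_DOCKING_FRACTION = 0.70  # approximation of "nearly 70%" from the abstract

# Hypothetical speedups of the coarse docking phase alone.
for phase_speedup in (2, 10, 100):
    total = overall_speedup(COARSE_DOCKING_FRACTION, phase_speedup)
    print(f"{phase_speedup:>4}x phase speedup -> {total:.2f}x overall")

print(f"upper bound: {1.0 / (1.0 - COARSE_DOCKING_FRACTION):.2f}x")
```

The bound explains why the remaining 30% of the run time matters once the dominant phase has been moved to the GPU.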

5.

Guiding Conventional Protein-Ligand Docking Software with Convolutional Neural Networks.

Jiang, Huaipan; Fan, Mengran; Wang, Jian; Sarma, Anup; Mohanty, Shruti; Dokholyan, Nikolay V; Mahdavi, Mehrdad; Kandemir, Mahmut T.

J Chem Inf Model ; 60(10): 4594-4602, 2020 10 26.

Article in English | MEDLINE | ID: mdl-33100014

ABSTRACT

High-performance computational techniques have brought significant benefits to drug discovery efforts in recent decades. One of the most challenging problems in drug discovery is protein-ligand binding pose prediction. To predict the most stable structure of the complex, conventional structure-based molecular docking methods depend heavily on the accuracy of the scoring or energy functions (an approximation of affinity) computed for each pose of the protein-ligand complex, which guide the search through an exponentially large solution space. However, due to the heterogeneity of molecular structures, existing scoring methods are either tailored to a particular data set or fail to exhibit high accuracy. In this paper, we propose a convolutional neural network (CNN)-based model that learns to predict the stability factor of the protein-ligand complex and demonstrates the ability of CNNs to improve existing docking software. Evaluation results on the PDBbind data set indicate that our approach reduces the execution time of the traditional docking-based method while improving accuracy. Our code, experiment scripts, and pretrained models are available at https://github.com/j9650/MedusaNet.

Subjects

Neural Networks, Computer; Proteins; Ligands; Molecular Docking Simulation; Protein Binding; Proteins/metabolism; Software
