Current strategies to address data scarcity in artificial intelligence-based drug discovery: A comprehensive review.
Gangwal, Amit; Ansari, Azim; Ahmad, Iqrar; Azad, Abul Kalam; Wan Sulaiman, Wan Mohd Azizi.
Affiliation
  • Gangwal A; Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India. Electronic address: gangwal.amit@gmail.com.
  • Ansari A; Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule, 424001, Maharashtra, India.
  • Ahmad I; Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Gondur, Dhule, 424002, Maharashtra, India. Electronic address: ansariiqrar50@gmail.com.
  • Azad AK; Faculty of Pharmacy, University College of MAIWP International, Batu Caves, 68100, Kuala Lumpur, Malaysia. Electronic address: azad2011iium@gmail.com.
  • Wan Sulaiman WMA; Faculty of Pharmacy, University College of MAIWP International, Batu Caves, 68100, Kuala Lumpur, Malaysia. Electronic address: drwanazizi@ucmi.edu.my.
Comput Biol Med ; 179: 108734, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38964243
ABSTRACT
Artificial intelligence (AI) has played a vital role in computer-aided drug design (CADD). This development has been further accelerated by the increasing use of machine learning (ML), mainly deep learning (DL), and by advances in computing hardware and software. As a result, initial doubts about the application of AI in drug discovery have been dispelled, leading to significant benefits in medicinal chemistry. At the same time, it is crucial to recognize that AI is still in its infancy and faces several limitations that need to be addressed to harness its full potential in drug discovery. Notable limitations include insufficient, unlabeled, and non-uniform data; the resemblance of some AI-generated molecules to existing molecules; the unavailability of adequate benchmarks; intellectual property rights (IPR)-related hurdles in data sharing; poor understanding of biology; a focus on proxy data and ligands; and the lack of holistic methods for representing input (molecular structures) that would avoid pre-processing of input molecules (feature engineering). The major component of AI infrastructure is input data, since most of the successes of AI-driven efforts to improve drug discovery depend, besides a few other factors, on the quality and quantity of the data used to train and test AI algorithms. Additionally, data-hungry DL approaches may fail to live up to their promise without sufficient data. The current literature suggests a few methods that, to a certain extent, handle low-data settings effectively and yield better output from AI models in the context of drug discovery. These include transfer learning (TL), active learning (AL), single or one-shot learning (OSL), multi-task learning (MTL), data augmentation (DA), and data synthesis (DS). A different method, federated learning (FL), enables sharing of proprietary data on a common platform to train an ML model without compromising data privacy.
In this review, we compare and discuss these methods, their recent applications, and their limitations when modeling small-molecule data to improve the output of AI methods in drug discovery. The article also summarizes some other novel methods for handling inadequate data.
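As a rough illustration of the federated learning idea mentioned in the abstract, the sketch below shows one common scheme, federated averaging: each site trains on its own data and only model weights are sent to a central server for aggregation. It is not taken from the review. The data are synthetic, a one-variable linear model stands in for a real property predictor, and all function names (`local_update`, `federated_average`, `make_site_data`, `run_federated`) are hypothetical.

```python
import random


def local_update(weights, data, lr=0.05, epochs=5):
    """Run a few epochs of gradient descent on one site's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y  # prediction error for one sample
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(updates, sizes):
    """Combine site models, weighting each by its number of samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def make_site_data(rng, n, lo, hi):
    """Synthetic stand-in for proprietary data: y = 2x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


def run_federated(rounds=200, seed=0):
    """Train a shared model; raw data never leaves the individual sites."""
    rng = random.Random(seed)
    sites = [
        make_site_data(rng, 20, -1.0, 0.0),
        make_site_data(rng, 30, 0.0, 1.0),
        make_site_data(rng, 50, -1.0, 1.0),
    ]
    sizes = [len(d) for d in sites]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each site starts from the current global model and trains locally.
        updates = [local_update(global_weights, d) for d in sites]
        # The server only ever sees weights, never the (x, y) pairs.
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    w, b = run_federated()
    print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

Sharing weights rather than data is only one part of privacy protection. Practical deployments typically add safeguards such as secure aggregation, which this sketch leaves out.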

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Artificial Intelligence / Drug Discovery Limits: Humans Language: En Journal: Comput Biol Med / Comput. biol. med / Computers in biology and medicine Publication year: 2024 Document type: Article
