ABSTRACT
Extreme gradient boosting (XGBoost) is an artificial intelligence algorithm capable of high accuracy and low inference time. The current study applies XGBoost to the production of platinum nano-film coatings through atomic layer deposition (ALD). To generate a database for model development, platinum is coated on α-Al2O3 using rotary-type ALD equipment. The process is controlled by four parameters: process temperature, stop valve time, precursor pulse time, and reactant pulse time. A total of 625 samples are obtained under different process conditions. The Al/Pt component ratio, measured by ICP-AES analysis during postprocessing, is used as the ALD coating index. The four process parameters serve as the input data, and the Al/Pt component ratio serves as the output data. The postprocessed data set is randomly divided into 500 training samples and 125 test samples. XGBoost demonstrates 99.9% accuracy and a coefficient of determination of 0.99. Its inference time is lower than that of random forest regression, and its prediction safety is higher than that of the light gradient boosting machine.
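The data handling described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the 625 conditions were chosen, so the sketch assumes a full factorial design with five levels per parameter (5^4 = 625) and uses level indices in place of real settings. The parameter names and the `r_squared` helper are hypothetical, and no model is fitted here; a gradient-boosting regressor would be trained on `train` and scored on `test`.

```python
import itertools
import random

# Hypothetical names for the four process parameters listed in the abstract.
PARAMETERS = [
    "process_temperature",
    "stop_valve_time",
    "precursor_pulse_time",
    "reactant_pulse_time",
]

# Assumption: five levels per parameter, which yields 5**4 = 625 conditions.
# Level indices stand in for the real settings, which the abstract does not give.
LEVELS = range(5)
conditions = list(itertools.product(LEVELS, repeat=len(PARAMETERS)))

# Random 500/125 split, as stated in the abstract (the seed is arbitrary).
rng = random.Random(0)
shuffled = conditions[:]
rng.shuffle(shuffled)
train, test = shuffled[:500], shuffled[500:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Here `r_squared` compares measured and predicted Al/Pt ratios on the 125 held-out samples; a value of 1 means the predictions match exactly, and 0 means they do no better than predicting the mean.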
ABSTRACT
Polypharmacy, the co-administration of multiple drugs, has become an area of concern as the elderly population grows and unexpected infectious diseases, such as COVID-19, keep emerging. However, it is very costly and time-consuming to experimentally examine the pharmacological effects of polypharmacy. To address this challenge, machine learning models that predict drug-drug interactions (DDIs) have been actively developed in recent years. In particular, the growing volume of drug datasets and advances in machine learning have facilitated model development. In this regard, this review discusses the DDI-predicting machine learning models that have been developed since 2018. Our discussion focuses on the dataset sources used to develop the models, featurization approaches for molecular structures and biological information, and the types of DDI prediction outcomes produced by the models. Finally, we suggest research opportunities in this field.
ABSTRACT
Covering: 2016 to 2021
Discovery of novel natural products has been greatly facilitated by advances in genome sequencing, genome mining and analytical techniques. As a result, the volume of data on natural products has increased over the years and has started to serve as raw material for developing machine learning models. In the past few years, a number of machine learning models have been developed to examine various aspects of a molecule by effectively processing its molecular structure. Understanding of the biological effects of natural products can benefit from such machine learning approaches. In this context, this Highlight reviews recent studies on machine learning models developed to infer various biological effects of molecules. Particular attention is paid to molecular featurization, or the computational representation of a molecular structure, which is an essential step in the development of a machine learning model. Technical challenges associated with the use of machine learning for natural products are further discussed.
Subjects
Biological Products/chemistry, Biological Products/pharmacology, Machine Learning, Drug Interactions, Molecular Structure
ABSTRACT
Background: The most common type of dementia, Alzheimer's disease (AD), is marked by the formation of extracellular amyloid beta (Aβ) plaques. Impairments of axons and synapses appear during Aβ plaque formation, and this damage could cause neurodegeneration. We previously reported that a non-saponin fraction with rich polysaccharide (NFP) from Korean Red Ginseng (KRG) showed neuroprotective effects in AD. However, the precise molecular mechanism of the therapeutic effects of NFP from KRG in AD remains elusive. Methods: To investigate the therapeutic mechanisms of NFP from KRG in AD, we conducted proteomic analysis of the frontal cortex from vehicle-treated wild-type mice, vehicle-treated 5XFAD mice, and NFP-treated 5XFAD mice using nano-LC-ESI-MS/MS. Metabolic network analysis was additionally performed because the proteome analysis indicated that the effects of NFP were associated with metabolism. Results: Starting from 5,470 proteins, 2,636 proteins were selected for hierarchical clustering analysis, and 111 proteins were finally selected for protein-protein interaction network analysis. This series of analyses revealed that proteins associated with synapses and mitochondria might be linked to the therapeutic mechanism of NFP. Subsequent metabolic network analysis using genome-scale metabolic models representing the three mouse groups showed significant changes in the metabolic fluxes of the mitochondrial carnitine shuttle pathway and the mitochondrial beta-oxidation of polyunsaturated fatty acids. Conclusion: Our results suggest that the therapeutic effects of NFP on AD are associated with synapse- and mitochondria-related pathways, and they provide targets for further rigorous studies aimed at a precise understanding of the molecular mechanism of NFP.
ABSTRACT
Computational analysis of biological data is becoming increasingly important, especially in this era of big data. It allows biological insights to be derived efficiently from given data, sometimes including counterintuitive ones that challenge existing knowledge. Experimental researchers without prior exposure to computer programming have often considered such analysis to be a task reserved for computational biologists. However, thanks to the increasing availability of user-friendly computational resources, experimental researchers can now easily access a scientific computing environment and the packages necessary for data analysis. In this regard, we here describe the process of accessing Jupyter Notebook, the most popular Python coding environment, to conduct computational biology. Python is currently a mainstream programming language for biology and biotechnology. In particular, Anaconda and Google Colaboratory are introduced as two representative options for easily launching Jupyter Notebook. Finally, the Python package COBRApy is demonstrated as an example to simulate 1) the specific growth rate of Escherichia coli, as well as the compounds consumed or generated, in a minimal medium with glucose as the sole carbon source, and 2) the theoretical production yield of succinic acid, an industrially important chemical, using E. coli. This protocol should serve as a guide to more extensive computational analyses of biological data for experimental researchers without a computational background.