ABSTRACT
Despite rapid progress in the field of metal-organic frameworks (MOFs), the potential of using machine learning (ML) methods to predict MOF synthesis parameters is still untapped. Here, we show how ML can be used for rationalization and acceleration of the MOF discovery process by directly predicting the synthesis conditions of a MOF based on its crystal structure. Our approach is based on: i) establishing the first MOF synthesis database via automatic extraction of synthesis parameters from the literature, ii) training and optimizing ML models by employing the MOF database, and iii) predicting the synthesis conditions for new MOF structures. The ML models, even at an initial stage, exhibit a good prediction performance, outperforming human expert predictions obtained through a synthesis survey. The automated synthesis prediction is available via a web tool at https://mof-synthesis.aimat.science.
Subjects
Metal-Organic Frameworks , Data Mining , Humans , Machine Learning , Metal-Organic Frameworks/chemistry
ABSTRACT
The adsorption and desorption of nucleic acids on solid surfaces is ubiquitous in research areas such as pharmaceutics, nanotechnology, molecular biology, and molecular electronics. In spite of this widespread importance, it is still not well understood how the negatively charged deoxyribonucleic acid (DNA) binds to the negatively charged silica surface in an aqueous solution. In this article, we study the adsorption of DNA to the silica surface using both modeling and experiments and shed light on the complicated binding (DNA to silica) process. The binding-agent-mediated DNA adsorption is captured by a cooperative Langmuir model. Bulk-depletion experiments show that a positively charged binding agent is necessary for efficient DNA binding, which complements the findings from the model. A profound understanding of DNA binding will help to tune various processes for efficient nucleic acid extraction and purification. Moreover, this work goes beyond DNA binding and can shed light on other binding-agent-mediated surface-surface, surface-molecule, and molecule-molecule interactions.
Subjects
Silicon Dioxide , Water , Adsorption , DNA , Surface Properties
ABSTRACT
Prediction of the pair potential given a typical configuration of an interacting classical system is a difficult inverse problem. There exists no exact result that can predict the potential given the structural information. We demonstrate that using machine learning (ML) one can get a quick but accurate answer to the question: "which pair potential leads to the given structure (represented by the pair correlation function)?" We use an artificial neural network (NN) to address this question and show that this ML technique is capable of providing very accurate predictions of the pair potential irrespective of whether the system is in a crystalline, liquid or gas phase. We show that the trained network works well for sample system configurations taken from both equilibrium and out-of-equilibrium simulations (active matter systems) when the latter is mapped to an effective equilibrium system with a modified potential. We show that the ML prediction of the effective interaction for the active system is not only useful for making predictions about the MIPS (motility-induced phase separation) phase but also identifies the transition towards this state.
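For orientation, the pair correlation function g(r) that serves as the structural input can be estimated from particle coordinates by histogramming pair distances. The sketch below assumes a cubic periodic box and uses only the Python standard library; the function name `pair_correlation`, the bin count, and the random test configuration are illustrative choices and are not taken from the paper.

```python
import math
import random

def pair_correlation(positions, box_length, n_bins=25):
    """Estimate g(r) for particles in a cubic periodic box (illustrative)."""
    n = len(positions)
    r_max = box_length / 2.0  # minimum-image distances are valid up to L/2
    bin_width = r_max / n_bins
    counts = [0] * n_bins
    for i in range(n - 1):
        for j in range(i + 1, n):
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box_length * round(d / box_length)  # nearest periodic image
                d2 += d * d
            k = int(math.sqrt(d2) / bin_width)
            if k < n_bins:
                counts[k] += 1
    density = n / box_length ** 3
    r_values, g_values = [], []
    for k, c in enumerate(counts):
        r_lo, r_hi = k * bin_width, (k + 1) * bin_width
        shell_volume = 4.0 / 3.0 * math.pi * (r_hi ** 3 - r_lo ** 3)
        # Number of pairs an ideal gas of the same density would put in this shell.
        ideal_pairs = 0.5 * n * density * shell_volume
        r_values.append(0.5 * (r_lo + r_hi))
        g_values.append(c / ideal_pairs)
    return r_values, g_values

random.seed(0)
box = 10.0
# Uncorrelated random points: g(r) should fluctuate around 1.
coords = [tuple(random.uniform(0.0, box) for _ in range(3)) for _ in range(300)]
r, g = pair_correlation(coords, box)
```

For uncorrelated points g(r) stays close to 1 at all distances; a liquid or crystal would instead show the peaks that encode the underlying interaction.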
ABSTRACT
Double-stranded DNA (dsDNA) has been established as an efficient medium for charge migration, bringing it to the forefront of the field of molecular electronics and biological research. The charge migration rate is controlled by the electronic couplings between the two nucleobases of DNA/RNA. These electronic couplings strongly depend on the intermolecular geometry and orientation. Estimating these electronic couplings for all the possible relative geometries of molecules using the computationally demanding first-principles calculations requires a lot of time and computational resources. In this article, we present a machine learning (ML)-based model to calculate the electronic coupling between any two bases of dsDNA/dsRNA and bypass the computationally expensive first-principles calculations. Using the Coulomb matrix representation which encodes the atomic identities and coordinates of the DNA base pairs to prepare the input dataset, we train a feedforward neural network model. Our neural network (NN) model can predict the electronic couplings between dsDNA base pairs with any structural orientation with a mean absolute error (MAE) of less than 0.014 eV. We further use the NN-predicted electronic coupling values to compute the dsDNA/dsRNA conductance.
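As an illustration of the input representation, the commonly used definition of the Coulomb matrix places 0.5·Z_i^2.4 on the diagonal and Z_i·Z_j/|R_i − R_j| off the diagonal. The sketch below implements that textbook form; whether the authors use exactly this variant (units, atom ordering, flattening) is an assumption, and the three-atom geometry is a toy example rather than a DNA base pair.

```python
import math

def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5*Z_i**2.4 on the diagonal, Z_i*Z_j/|R_i-R_j| elsewhere."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix

# Toy water-like molecule; coordinates are illustrative values in bohr.
charges = [8, 1, 1]
coords = [(0.0, 0.0, 0.0), (1.8, 0.0, 0.0), (-0.45, 1.75, 0.0)]
cm = coulomb_matrix(charges, coords)
```

The matrix is symmetric and depends only on interatomic distances, so it is unchanged by rigid translation or rotation of the molecule.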
Subjects
DNA , Computer Neural Networks , Base Pairing , Electronics , Machine Learning
ABSTRACT
Protein-surface interactions are exploited in various processes in life sciences and biotechnology. Many such processes are performed in the presence of a buffer system, which is generally believed to influence the protein-surface interaction but is rarely investigated systematically. Combining experimental and theoretical methodologies, we herein demonstrate the strong influence of the buffer type on protein-surface interactions. Using state-of-the-art chromatographic experiments, we measure the interaction between individual amino acids and silica, as a reference to understand protein-surface interactions. Among the 20 proteinogenic amino acids studied, we found that arginine (R) and lysine (K) bind most strongly to silica, a finding validated by free energy calculations. We further measured the binding of R and K at different pH in the presence of two different buffers, MOPS (3-(N-morpholino)propanesulfonic acid) and TRIS (tris(hydroxymethyl)aminomethane), and find dramatically different behavior. In the presence of TRIS, the binding affinity of R/K increases with pH, whereas we observe the opposite trend for MOPS. These results can be understood using a multiscale modelling framework combining molecular dynamics simulation and the Langmuir adsorption model. The modelling approach helps to optimize buffer conditions in various fields such as biosensors, drug delivery or bioseparation engineering prior to the experiment.
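To make the Langmuir step concrete, the single-site Langmuir isotherm gives the fractional surface coverage as θ = Kc/(1 + Kc). The sketch below pairs it with the standard relation K = exp(−ΔG/RT) as one plausible way to feed a simulated binding free energy into the isotherm; how the authors actually couple the two levels is not stated here, so that link, the function names, and the numerical values are assumptions for illustration.

```python
import math

R_GAS = 8.314462618e-3  # gas constant in kJ/(mol K)

def binding_constant(delta_g, temperature=298.15):
    """Dimensionless K = exp(-dG/RT), relative to a 1 mol/L standard state."""
    return math.exp(-delta_g / (R_GAS * temperature))

def langmuir_coverage(concentration, k_eq):
    """Fractional coverage for the single-site Langmuir isotherm."""
    return k_eq * concentration / (1.0 + k_eq * concentration)

# Hypothetical binding free energies in kJ/mol (not values from the paper).
k_weak = binding_constant(-10.0)
k_strong = binding_constant(-20.0)
theta_weak = langmuir_coverage(1e-3, k_weak)      # 1 mmol/L solute
theta_strong = langmuir_coverage(1e-3, k_strong)
```

A more negative binding free energy gives a larger K and hence a higher coverage at the same concentration, which is the qualitative trend such a framework would use to compare buffers.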
ABSTRACT
Charge transport in deoxyribonucleic acid (DNA) is of immense interest in biology and molecular electronics. Electronic coupling between the DNA bases is an important parameter describing the efficiency of charge transport in DNA. A reasonable estimation of this electronic coupling requires many expensive first-principles calculations. In this article, we present a machine learning (ML) based model to calculate the electronic coupling between the guanine bases of DNA (in the same strand) of any length, thus avoiding expensive first-principles calculations. The electronic coupling between the bases is evaluated using density functional theory (DFT) calculations with the morphologies derived from fully atomistic molecular dynamics (MD) simulations. A new and simple protocol based on the coarse-grained model of the DNA has been used to extract the feature vectors for the DNA bases. A deep neural network (NN) is trained with the feature vector as input and the DFT-calculated electronic coupling as output. Once well trained, the NN can predict the DFT-calculated electronic coupling of new structures with a mean absolute error (MAE) of 0.02 eV.
Subjects
DNA/chemistry , Guanine/chemistry , Machine Learning , Density Functional Theory , Electronics , Molecular Dynamics Simulation
ABSTRACT
Inherent molecular fluctuations are known to have a significant influence on the charge transport properties of biomolecules like DNA, PNA and proteins. In this work, we show ways to control these fluctuations and further demonstrate their use to enhance the conductance of two widely studied molecular wires, namely dsDNA (DNA) and G4 Quadruplex (G4-Quad). We quantify the molecular fluctuation in terms of the root mean square deviation (RMSD) of the molecule. In the case of DNA, we use temperature to control the fluctuations, while in the case of G4-Quad the fluctuations are tuned by the ions inside the pore. The electronic coupling between the bases of dsDNA and G4-Quad, which measures the conductance of these molecular wires, shows a non-monotonic behaviour with the increase in fluctuation. We find values of fluctuation which give rise to maximum electronic coupling and hence high conductivity for both the cases. In the case of DNA, these optimal fluctuations (~2.5 Å) are achieved at a temperature of 210 K, which gives rise to an electronic coupling of 0.135 eV between the DNA bases. The optimal fluctuations in G4-Quad are achieved (~7 Å) in a 4 base pair long system with 2 Na+ ions inside the pore, giving rise to an electronic coupling of 0.09 eV.
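The fluctuation measure used above, the RMSD, is the square root of the mean squared displacement of the atoms relative to a reference structure. A minimal sketch follows; it assumes the two structures are already superimposed, and the choice of atoms and reference frame in the actual study is not specified here, so the coordinates are purely illustrative.

```python
import math

def rmsd(coords, reference):
    """Root mean square deviation between two equally sized coordinate sets."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(squared / len(coords))

# Illustrative three-atom reference and a frame shifted rigidly by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
frame = [(x + 1.0, y, z) for x, y, z in reference]
fluctuation = rmsd(frame, reference)
```

In practice a rigid shift like this one would be removed by aligning each frame to the reference first, so that the RMSD reports internal fluctuations only.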
Subjects
DNA/chemistry , Electric Conductivity , G-Quadruplexes , Molecular Dynamics Simulation , Peptide Nucleic Acids/chemistry , Proteins/chemistry , Quantum Theory , Temperature
ABSTRACT
In spite of the striking difference between twist-stretch coupling of dsRNA and dsDNA under external force, dsRNA shows similar structural polymorphism to dsDNA under different pulling protocols. Our atomistic MD simulations show that overstretching dsRNA along the 3' direction of the opposite strands (OS3) leads to the emergence of S-RNA whereas overstretching along the 5' directions of the opposite strands (OS5) leads to melting of dsRNA at lower forces. Using the dsRNA morphology from pulling MD simulations, we use a multiscale method involving ab initio calculations and Kinetic Monte Carlo (KMC) simulations to estimate the conductance of dsRNA and find that the conformational changes drastically affect its conductance. The current through dsRNA chains drastically drops after a critical stretching length and critically depends on the pulling protocol. The critical stretching length for the OS3 pulling case is around 65% higher than that of the OS5 case.
Subjects
DNA/chemistry , Double-Stranded RNA/chemistry , Base Pairing , Electrochemical Techniques/methods , Electrodes , Hydrogen Bonding , Mechanical Phenomena , Molecular Dynamics Simulation , Monte Carlo Method , Nucleic Acid Conformation
ABSTRACT
We report a structural polymorphism of the S-DNA when a canonical B-DNA is stretched under different pulling protocols and provide a fundamental molecular understanding of the DNA stretching mechanism. Extensive all-atom molecular dynamics simulations reveal a clear formation of S-DNA when the B-DNA is stretched along the 3' directions of the opposite strands (OS3), characterized by changes in the number of H-bonds, entropy, and free energy. Stretching along the 5' directions of the opposite strands (OS5) leads to a force-induced melted form of the DNA. Interestingly, stretching along the opposite ends of the same strand leads to a coexistence of both the S- and melted M-DNA structures. We also characterize the structure of the S-DNA by calculating various helical parameters. We find that the S-DNA has a twist of ~10°, which corresponds to a helical repeat length of ~36 base pairs, in close agreement with previous experimental results. Moreover, we find that the free energy barrier between the canonical and overstretched states of DNA is higher for the same-termini pulling protocol than for all other protocols considered in this work. Overall, our observations not only agree qualitatively with the available experimental results but also enhance the understanding of different overstretched DNA structures.
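The link between the quoted twist and the helical repeat is simple arithmetic: one full turn of the helix is 360°, so the number of base pairs per turn is 360° divided by the twist per base-pair step. The symbols below (Ω for the twist per step, n for the repeat) are notation introduced here for illustration.

```latex
n_{\mathrm{repeat}} = \frac{360^\circ}{\Omega}
\approx \frac{360^\circ}{10^\circ} = 36 \ \text{base pairs per turn}
```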
Subjects
B-Form DNA/chemistry , Crystallization , Molecular Dynamics Simulation , Nucleic Acid Conformation
ABSTRACT
Using atomistic molecular dynamics simulation, we study the discotic columnar liquid crystalline (LC) phases formed by a new organic compound having hexa-peri-Hexabenzocoronene (HBC) core with six pendant oligothiophene units recently synthesized by Nan Hu et al. [Adv. Mater. 26, 2066 (2014)]. This HBC core based LC phase was shown to have electric field responsive behavior and has important applications in organic electronics. Our simulation results confirm the hexagonal arrangement of columnar LC phase with a lattice spacing consistent with that obtained from small angle X-ray diffraction data. We have also calculated various positional and orientational correlation functions to characterize the ordering of the molecules in the columnar arrangement. The molecules in a column are arranged with an average twist of 25° having an average inter-molecular separation of ~5 Å. Interestingly, we find an overall tilt angle of 43° between the columnar axis and HBC core. We also simulate the charge transport through this columnar phase and report the numerical value of charge carrier mobility for this liquid crystal phase. The charge carrier mobility is strongly influenced by the twist angle and average spacing of the molecules in the column.
ABSTRACT
Balancing accuracy and efficiency is a common problem in molecular simulation. This tradeoff is evident in coarse-grained molecular dynamics simulation, which prioritizes efficiency, and all-atom molecular simulation, which prioritizes accuracy. Despite continuous efforts, creating a coarse-grained model that accurately captures both the system's structure and dynamics remains elusive. In this article, we present a data-driven approach for constructing coarse-grained models that aim to describe both the structure and dynamics of the system equally well. While the development of machine learning models is well-received in the scientific community, the significance of dataset creation for these models is often overlooked. However, data-driven approaches cannot progress without a robust dataset. To address this, we construct a database of synthetic coarse-grained potentials generated from unphysical all-atom models. A neural network is trained with the generated database to predict the coarse-grained potentials of real liquids. We evaluate their quality by calculating the combined loss of structural and dynamical accuracy upon coarse-graining. When we compare our machine learning-based coarse-grained potential with the one from iterative Boltzmann inversion, the machine learning prediction turns out better for all eight hydrocarbon liquids we studied. As all-atom surfaces turn more nonspherical, both ways of coarse-graining degrade. Still, the neural network outperforms iterative Boltzmann inversion in constructing good quality coarse-grained models for such cases. The synthetic database and the developed machine learning models are freely available to the community, and we believe that our approach will generate interest in efficiently deriving accurate coarse-grained models for liquids.
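For context on the baseline method, iterative Boltzmann inversion refines a tabulated coarse-grained pair potential until the coarse-grained g(r) matches the target, using the update V_new(r) = V(r) + k_B·T·ln(g_current(r)/g_target(r)). The sketch below shows only this update rule on a toy grid; the damping factor, the units, and the sample g(r) values are assumptions for illustration, and the coarse-grained simulation that would produce `g_current` at each iteration is not included.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def boltzmann_inversion(g_target, temperature):
    """Initial guess: potential of mean force, V0(r) = -kB*T*ln g_target(r)."""
    return [-K_B * temperature * math.log(g) for g in g_target]

def ibi_update(potential, g_current, g_target, temperature, damping=1.0):
    """One IBI step: V_new(r) = V(r) + damping*kB*T*ln(g_current/g_target)."""
    kt = K_B * temperature
    return [v + damping * kt * math.log(gc / gt)
            for v, gc, gt in zip(potential, g_current, g_target)]

# Toy tabulated values on three grid points (all g values must be positive).
g_target = [0.5, 1.5, 1.0]
g_current = [0.6, 1.4, 1.0]
v0 = boltzmann_inversion(g_target, temperature=300.0)
v1 = ibi_update(v0, g_current, g_target, temperature=300.0)
```

Where the current g(r) is too high the potential is raised (more repulsive), where it is too low the potential is lowered, and where the two agree it is left unchanged.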
ABSTRACT
Coarse-grained molecular dynamics (MD) simulation is a promising alternative to all-atom MD simulation for the fast calculation of system properties, which is imperative in designing materials with a specific target property. There have been several coarse-graining strategies developed over the past few years that provide accurate structural properties of the system. However, these coarse-grained models share a major drawback in that they introduce an artificial acceleration in molecular mobility. In this paper, we report a data-driven approach to generate coarse-grained models that preserve the all-atom molecular mobility. We designed a machine learning model in the form of an artificial neural network, which directly predicts the simulation-ready mobility-preserving coarse-grained potential as an output given the all-atom force field (FF) parameters as inputs. As a proof of principle, we took 2,3,4-trimethylpentane as a model system and described the development of machine learning models in detail. We quantify the artificial acceleration in molecular mobility by defining the acceleration factor as the ratio of the coarse-grained and the all-atom diffusion coefficient. The predicted coarse-grained potential generated by the best machine learning model can bring down the acceleration factor to a value of ~2, which could be otherwise as large as 7 for a typical value of 3 × 10⁻⁹ m² s⁻¹ for the all-atom diffusion coefficient. We believe our method will be of interest in the community as a route to generating coarse-grained potentials with accurate dynamics.
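The acceleration factor defined above is a plain ratio of diffusion coefficients. The sketch below computes it, with each diffusion coefficient obtained from the slope of the mean squared displacement via the three-dimensional Einstein relation D = slope/6. The use of a least-squares MSD fit is an assumption about how D would be measured, and the synthetic MSD data are constructed only to reproduce the factor of 7 quoted in the abstract.

```python
def diffusion_coefficient(times, msd):
    """Einstein relation in 3D: D = (least-squares slope of MSD vs t) / 6."""
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    numerator = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator / 6.0

def acceleration_factor(d_coarse_grained, d_all_atom):
    """Ratio of coarse-grained to all-atom diffusion coefficient."""
    return d_coarse_grained / d_all_atom

# Synthetic, perfectly linear MSD curves (SI units: s and m^2).
times = [i * 1e-12 for i in range(1, 11)]
msd_aa = [6.0 * 3.0e-9 * t for t in times]   # all-atom, D = 3e-9 m^2/s
msd_cg = [6.0 * 2.1e-8 * t for t in times]   # coarse-grained, 7 times faster
factor = acceleration_factor(diffusion_coefficient(times, msd_cg),
                             diffusion_coefficient(times, msd_aa))
```

A factor of 1 would mean the coarse-grained model reproduces the all-atom mobility exactly.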
Subjects
Biological Models , Molecular Dynamics Simulation
ABSTRACT
Understanding the interactions between proteins and silica surfaces is crucial for a wide range of applications: from medical devices, drug delivery and bioelectronics to biotechnology and downstream processing. We show the application of EISM (Effective Implicit Surface Model) for discovering the set of peptide interactions with silica surfaces. The EISM is employed for high-speed computational screening of peptides to model the binding affinity of small peptides to silica surfaces. The simulations are complemented with experimental data on peptides with silica nanoparticles from microscale thermophoresis and from infrared spectroscopy. The experimental work shows excellent agreement with the computational results and verifies the EISM model for the prediction of peptide-surface interactions. A total of 57 peptides, with amino acids favorable for adsorption on silica surfaces, are screened with the EISM model, and the results can serve as guidance for future experimental and theoretical work. This model can be used as a broad platform for multiple challenges at surfaces and can be applied to multiple surfaces and biomolecules beyond silica and peptides.
Subjects
Peptides , Silicon Dioxide , Adsorption , Amino Acids , Computer Simulation , Peptides/chemistry , Silicon Dioxide/chemistry , Surface Properties
ABSTRACT
The interaction of proteins and peptides with inorganic surfaces is relevant in a wide array of technological applications. A rational approach to design peptides for specific surfaces would build on amino-acid- and surface-specific interaction models, which are difficult to characterize experimentally or by modeling. Even with such a model at hand, the large number of possible sequences and the large conformation space of peptides make comparative simulations challenging. Here we present a computational protocol, the effective implicit surface model (EISM), for efficient in silico evaluation of the binding affinity trends of peptides on parameterized surfaces, with a specific application to the widely studied gold surface. In EISM the peptide-surface interactions are modeled with an amino-acid- and surface-specific implicit solvent model, which permits rapid exploration of the peptide conformational degrees of freedom. We demonstrate the parametrization of the model and compare the results with all-atom simulations and experimental results for specific peptides.
Subjects
Gold , Peptides , Adsorption , Proteins , Solvents , Surface Properties
ABSTRACT
Adsorption and desorption of molecules are key processes in extraction and purification of biomolecules, engineering of drug carriers, and designing of surface-specific coatings. To understand the adsorption process on the atomic scale, state-of-the-art quantum mechanical and classical simulation methodologies are widely used. However, studying adsorption using a full quantum mechanical treatment is limited to picosecond simulation timescales, while classical molecular dynamics simulations are limited by the accuracy of the existing force fields. To overcome these challenges, we propose a systematic way to generate flexible, application-specific, highly accurate force fields by training artificial neural networks. As a proof of concept, we study the adsorption of the amino acid alanine on graphene and gold (111) surfaces and demonstrate the force field generation methodology in detail. We find that a molecule-specific force field with Lennard-Jones type two-body terms incorporating the 3rd and 7th power of the inverse distances between the atoms of the adsorbent and the surfaces yields optimal results, which is surprisingly different from typical Lennard-Jones potentials used in traditional force fields. Furthermore, we present an efficient and easy-to-train machine learning model that incorporates system-specific three-body (or higher order) interactions that are required, for example, for gold surfaces. Our final machine learning-based force field yields a mean absolute error of less than 4.2 kJ/mol at a speed-up of ~10⁵ times compared to quantum mechanical calculation, which will have a significant impact on the study of adsorption in different research areas.
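To illustrate how a two-body term built from the 3rd and 7th inverse powers differs from the usual 12-6 form, the sketch below evaluates a generic pair energy c_rep/r^p − c_att/r^q. Assigning the repulsion to the 7th power and the attraction to the 3rd is an assumption made here, as are the function names and the unit coefficients; the abstract does not give the functional form or the fitted parameters.

```python
def pair_energy(r, c_rep, c_att, p_rep=7, p_att=3):
    """Generic Lennard-Jones-type term: c_rep / r**p_rep - c_att / r**p_att."""
    return c_rep / r ** p_rep - c_att / r ** p_att

def total_energy(distances, c_rep, c_att, p_rep=7, p_att=3):
    """Sum the pair term over all adsorbate-surface atom distances."""
    return sum(pair_energy(r, c_rep, c_att, p_rep, p_att) for r in distances)

# With unit coefficients the 7-3 minimum sits at r = (7/3)**(1/4),
# compared with r = 2**(1/6) for the 12-6 form.
r_min_7_3 = (7.0 / 3.0) ** 0.25
r_min_12_6 = 2.0 ** (1.0 / 6.0)
e_7_3 = pair_energy(r_min_7_3, 1.0, 1.0)
e_12_6 = pair_energy(r_min_12_6, 1.0, 1.0, p_rep=12, p_att=6)
```

The lower exponents give a softer wall and a much longer-ranged attraction than the 12-6 form, which is the qualitative difference the abstract points to.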
ABSTRACT
In this work we study the structure-transport property relationships of small ligand intercalated DNA molecules using a multiscale modeling approach where extensive ab initio calculations are performed on numerous MD-simulated configurations of dsDNA and dsDNA intercalated with two different intercalators, ethidium and daunomycin. DNA conductance is found to increase by one order of magnitude upon drug intercalation due to the local unwinding of the DNA base pairs adjacent to the intercalated sites, which leads to modifications of the density of states in the near-Fermi-energy region of the ligand-DNA complex. Our study suggests that the intercalators can be used to enhance or tune the DNA conductance, which opens new possibilities for their potential applications in nanoelectronics.
Subjects
DNA/chemistry , Molecular Models , Nucleic Acid Conformation
ABSTRACT
Owing to their high specific surface and low production cost, carbon materials are among the most important adsorption materials. Novel usages, for instance in pharmaceutical applications, challenge existing methods because charged and strongly polar substances need to be adsorbed. Here, we systematically investigate the highly complex adsorption equilibria of organic molecules having multiple protonation states as a function of pH. The adsorption behavior depends on intermolecular interactions within the solution (dissociation equilibria) and between adsorbed molecules on the carbon surface (electrostatic forces). For the model substances maleic acid and phenylalanine, we demonstrate that a custom-made genetic algorithm is able to extract up to nine parameters of a multispecies isotherm from experimental data covering a broad pH-range. The parameters, including adsorption affinities, interaction energies, and maximum loadings were also predicted by molecular dynamics simulations. Both approaches obtained a good qualitative and mostly also quantitative description of the adsorption behavior within a pH-range of 2-12. By combining the determined isotherms with mass balances, the final concentrations and pH-shifts of batch adsorption experiments can be predicted. The developed modeling tools can be easily adapted to other types of pH-dependent, multispecies adsorbates and therefore will help to optimize adsorption-based processes in different fields.
ABSTRACT
Interactions of biomolecules with inorganic oxide surfaces such as silica in aqueous solutions are of profound interest in various research fields, including chemistry, biotechnology, and medicine. While there is a general understanding of the dominating electrostatic interactions, the binding mechanism is still not fully understood. Here, chromatographic zonal elution and flow microcalorimetry experiments were combined with molecular dynamics simulations to describe the interaction of different capped amino acids with the silica surface. We demonstrate that ion pairing is the dominant electrostatic interaction. Surprisingly, the interaction strength is more dependent on the repulsive carboxy group than on the attracting amino group. These findings are essential for conducting experimental and simulative studies on amino acids when transferring the results to biomolecule-surface interactions.
Subjects
Alanine/chemistry , Arginine/chemistry , Silicon Dioxide/chemistry , Alanine/metabolism , Arginine/metabolism , Calorimetry , Molecular Dynamics Simulation , Silicon Dioxide/metabolism , Static Electricity , Surface Properties
ABSTRACT
In this study, we compare the charge transport properties of multiple double-stranded (ds)RNA sequences with corresponding dsDNA sequences. Recent studies have presented a contradictory picture of relative charge transport efficiencies in A-form DNA:RNA hybrids and dsDNA. Using a multiscale modelling framework, we compute the conductance of dsDNA and dsRNA using the Landauer formalism in the coherent limit and Marcus-Hush theory in the incoherent limit. We find that dsDNA conducts better than dsRNA in both charge transport regimes. Our analysis shows that the structural differences in the twist angle and slide of dsDNA and dsRNA are the main reasons behind the higher conductance of dsDNA in the incoherent hopping regime. In the coherent limit, however, for the same base pair length, the conductance of dsRNA is higher than that of dsDNA for the morphologies where dsRNA has a smaller end-to-end length relative to that of dsDNA.
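For the incoherent limit, the standard Marcus expression gives the hopping rate between two sites as k = (2π/ħ)·J²·(4πλk_BT)^(−1/2)·exp(−(ΔG + λ)²/(4λk_BT)), where J is the electronic coupling, λ the reorganization energy and ΔG the free energy difference. The sketch below evaluates this textbook formula in eV units; the parameter values are hypothetical and are not taken from the study.

```python
import math

HBAR = 6.582119569e-16  # reduced Planck constant in eV s
K_B = 8.617333262e-5    # Boltzmann constant in eV/K

def marcus_rate(coupling, reorg_energy, delta_g=0.0, temperature=300.0):
    """Marcus hopping rate in 1/s; all energies in eV, temperature in K."""
    kt = K_B * temperature
    prefactor = (2.0 * math.pi / HBAR) * coupling ** 2
    prefactor /= math.sqrt(4.0 * math.pi * reorg_energy * kt)
    exponent = -(delta_g + reorg_energy) ** 2 / (4.0 * reorg_energy * kt)
    return prefactor * math.exp(exponent)

# Hypothetical values: J = 0.01 eV or 0.02 eV, lambda = 0.5 eV.
rate_small = marcus_rate(0.01, 0.5)
rate_large = marcus_rate(0.02, 0.5)
```

Because the rate scales with J², structural changes that alter the coupling between neighbouring bases, such as twist and slide, translate directly into changes in the hopping conductance.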
Subjects
DNA , Double-Stranded RNA , Base Pairing
ABSTRACT
Recent progress in the improvement of organic solar cells has led to power conversion efficiencies of over 16%. One of the key factors for this improvement is a more favorable energy level alignment between donor and acceptor materials, which demonstrates that the properties of interfaces between donor and acceptor regions are of paramount importance. Recent investigations showed a significant dependence of the energy levels of organic semiconductors upon admixture of different materials, but its origin is presently not well understood. Here, we use multiscale simulation protocols to investigate the molecular origin of the mixing-induced energy level shifts and show that electrostatic properties, in particular higher-order multipole moments and polarizability, determine the strength of the effect. The findings of this study may guide future material-design efforts to improve device performance by systematic modification of molecular properties.