Results 1 - 6 of 6
1.
bioRxiv ; 2022 Nov 23.
Article in English | MEDLINE | ID: mdl-36451881

ABSTRACT

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represent one of the first whole-genome-scale foundation models that can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.

2.
Int J High Perform Comput Appl ; 36(5-6): 603-623, 2022 Nov.
Article in English | MEDLINE | ID: mdl-38464362

ABSTRACT

The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside a human cell. Attacking RTC function with pharmaceutical compounds is a pathway to treating COVID-19. Conventional tools, e.g., cryo-electron microscopy and all-atom molecular dynamics (AAMD), do not provide sufficiently high resolution or timescale to capture important dynamics of this molecular machine. Consequently, we develop an innovative workflow that bridges the gap between these resolutions, using mesoscale fluctuating finite element analysis (FFEA) continuum simulations and a hierarchy of AI methods that continually learn and infer features for maintaining consistency between AAMD and FFEA simulations. We leverage a multi-site distributed workflow manager to orchestrate AI, FFEA, and AAMD jobs, providing optimal resource utilization across HPC centers. Our study provides unprecedented access to study the SARS-CoV-2 RTC machinery, while providing general capability for AI-enabled multi-resolution simulations at scale.

3.
Sci Data ; 6(1): 307, 2019 Dec 05.
Article in English | MEDLINE | ID: mdl-31804487

ABSTRACT

The ability to auto-generate databases of optical properties holds great prospects in data-driven materials discovery for optoelectronic applications. We present a cognate set of experimental and computational data that describes key features of optical absorption spectra. This includes an auto-generated database of 18,309 records of experimentally determined UV/vis absorption maxima, λmax, and associated extinction coefficients, ϵ, where present. This database was produced using the text-mining toolkit, ChemDataExtractor, on 402,034 scientific documents. High-throughput electronic-structure calculations using fast (simplified Tamm-Dancoff approach) and traditional (time-dependent) density functional theory were executed to predict λmax and oscillator strengths, f (related to ϵ), for a subset of validated compounds. Paired quantities of these computational and experimental data show strong correlations in λmax, f and ϵ, laying the path for reliable in silico calculations of additional optical properties. The total dataset of 8,488 unique compounds and a subset of 5,380 compounds with experimental and computational data are available in MongoDB, CSV and JSON formats. These can be queried using Python, R, Java, and MATLAB, for data-driven optoelectronic materials discovery.
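
As a rough illustration of how paired experimental and computed λmax values might be compared after a CSV export, the sketch below computes a Pearson correlation coefficient. The column names `lambda_max_exp` and `lambda_max_calc`, the helper names, and the inline sample rows are hypothetical placeholders chosen for this example; they are not the dataset's actual schema or values.

```python
import csv
import io
import math

# Hypothetical CSV layout with synthetic numbers; the real export's
# column names and contents may differ.
SAMPLE_CSV = """compound,lambda_max_exp,lambda_max_calc
A,300.0,310.0
B,350.0,362.0
C,400.0,409.0
D,450.0,
"""


def load_pairs(text, exp_col="lambda_max_exp", calc_col="lambda_max_calc"):
    """Return (experimental, computed) pairs, skipping rows with a missing value."""
    pairs = []
    for row in csv.DictReader(io.StringIO(text)):
        exp, calc = row.get(exp_col), row.get(calc_col)
        if exp and calc:
            pairs.append((float(exp), float(calc)))
    return pairs


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    pairs = load_pairs(SAMPLE_CSV)  # row D is skipped: no computed value
    print(len(pairs), round(pearson_r(pairs), 4))
```

Rows lacking either value are dropped before the correlation is computed, which mirrors the idea of working only with the subset that has both experimental and computational data.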

4.
Sci Adv ; 5(3): eaav1190, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30915396

ABSTRACT

Computational studies aimed at understanding conformationally dependent electronic structure in soft materials require a combination of classical and quantum-mechanical simulations, for which the sampling of conformational space can be particularly demanding. Coarse-grained (CG) models provide a means of accessing relevant time scales, but CG configurations must be back-mapped into atomistic representations to perform quantum-chemical calculations, which is computationally intensive and inconsistent with the spatial resolution of the CG models. A machine learning approach, denoted as artificial neural network electronic coarse graining (ANN-ECG), is presented here in which the conformationally dependent electronic structure of a molecule is mapped directly to CG pseudo-atom configurations. By averaging over decimated degrees of freedom, ANN-ECG accelerates simulations by eliminating backmapping and repeated quantum-chemical calculations. The approach is accurate, consistent with the CG spatial resolution, and can be used to identify computationally optimal CG resolutions.

5.
ACS Cent Sci ; 3(5): 415-424, 2017 May 24.
Article in English | MEDLINE | ID: mdl-28573203

ABSTRACT

Organic glass films formed by physical vapor deposition exhibit enhanced stability relative to those formed by conventional liquid cooling and aging techniques. Recently, experimental and computational evidence has emerged indicating that the average molecular orientation can be tuned by controlling the substrate temperature at which these "stable glasses" are grown. In this work, we present a comprehensive all-atom simulation study of ethylbenzene, a canonical stable-glass former, using a computational film formation procedure that closely mimics the vapor deposition process. Atomistic studies of experimentally formed vapor-deposited glasses have not been performed before, and this study therefore begins by verifying that the model and method utilized here reproduce key structural features observed experimentally. Having established agreement between several simulated and experimental macroscopic observables, simulations are used to examine the substrate temperature dependence of molecular orientation. The results indicate that ethylbenzene glasses are anisotropic, depending upon substrate temperature, and that this dependence can be understood from the orientation present at the surface of the equilibrium liquid. By treating ethylbenzene as a simple model for molecular semiconducting materials, a quantum-chemical analysis is then used to show that the vapor-deposited glasses exhibit decreased energetic disorder and increased magnitude of the mean-squared transfer integral relative to isotropic, liquid-cooled films, an effect that is attributed to the anisotropic ordering of the molecular film. These results suggest a novel structure-function simulation strategy capable of tuning the electronic properties of organic semiconducting glasses prior to experimental deposition, which could have considerable potential for organic electronic materials design.

6.
ACS Macro Lett ; 6(2): 155-160, 2017 Feb 21.
Article in English | MEDLINE | ID: mdl-35632885

ABSTRACT

Coarse-grained molecular dynamics enhanced by free-energy sampling methods is used to examine the roles of solvophobicity and multivalent salts on polyelectrolyte brush collapse. Specifically, we demonstrate that while ostensibly similar, solvophobic collapsed brushes and multivalent-ion collapsed brushes exhibit distinct mechanistic and structural features. Notably, multivalent-induced heterogeneous brush collapse is observed under good solvent polymer backbone conditions, demonstrating that the mechanism of multivalent collapse is not contingent upon a solvophobic backbone. Umbrella sampling of the potential of mean force (PMF) between two individual brush strands confirms this analysis, revealing starkly different PMFs under solvophobic and multivalent conditions, suggesting the role of multivalent "bridging" as the discriminating feature in trivalent collapse. Structurally, multivalent ions show a propensity for nucleating order within collapsed brushes, whereas poor-solvent collapsed brushes are more disordered; this difference is traced to the existence of a metastable PMF minimum for poor solvent conditions, and a global PMF minimum for trivalent systems, under experimentally relevant conditions.
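
For readers unfamiliar with how a PMF is recovered from umbrella sampling, a minimal sketch for a single window follows. It assumes a harmonic bias w(r) = ½k(r − r0)² and uses the standard single-window relation F(r) = −kT ln P_biased(r) − w(r) + C. The function names and parameters are hypothetical and not taken from the paper; stitching many windows together normally needs a reweighting scheme such as WHAM, which is not shown here.

```python
import math


def harmonic_bias(r, k, r0):
    """Harmonic umbrella potential w(r) = 0.5 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def unbias_window(centers, counts, k, r0, kT):
    """Estimate the PMF within one umbrella window.

    centers: bin centers of the sampled coordinate
    counts:  histogram weights collected under the bias
    Returns PMF values shifted so the minimum is zero;
    bins with no samples are reported as None.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram is empty")
    pmf = []
    for r, c in zip(centers, counts):
        if c == 0:
            pmf.append(None)  # no samples, free energy undefined here
        else:
            # Boltzmann-invert the biased probability, then subtract the bias.
            pmf.append(-kT * math.log(c / total) - harmonic_bias(r, k, r0))
    f_min = min(f for f in pmf if f is not None)
    return [None if f is None else f - f_min for f in pmf]
```

A quick sanity check of the relation: if the underlying PMF is flat, the biased histogram is proportional to exp(−w(r)/kT), and the function returns zero in every sampled bin.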
