Results 1 - 4 of 4
1.
Preprint in English | bioRxiv | ID: ppbiorxiv-511571

ABSTRACT

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pretraining on over 110 million prokaryotic gene sequences and finetuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represent one of the first whole-genome-scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.
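
As a rough reading of the compute figures quoted in this abstract, the sketch below divides the stated total operation count by the stated sustained rate to get an implied wall-clock time, and compares the sustained rate with the peak. It assumes that both figures describe the same training runs and that "1.63 Zettaflops" denotes a total of 1.63 x 10^21 floating-point operations; the abstract states neither, so the result is an order-of-magnitude illustration only. All names in the code are illustrative.

```python
# Figures quoted in the abstract; their interpretation is an assumption.
TOTAL_OPERATIONS = 1.63e21  # "1.63 Zettaflops", read as a total operation count
SUSTAINED_RATE = 121e15     # 121 PFLOPS sustained, mixed precision
PEAK_RATE = 850e15          # 850 PFLOPS peak


def implied_hours(total_operations: float, rate_per_second: float) -> float:
    """Wall-clock hours needed to execute total_operations at a constant rate."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return total_operations / rate_per_second / 3600.0


# Hours implied if every operation ran at the sustained rate.
sustained_hours = implied_hours(TOTAL_OPERATIONS, SUSTAINED_RATE)

# Fraction of the quoted peak that the sustained rate represents.
peak_fraction = SUSTAINED_RATE / PEAK_RATE

print(f"Implied time at sustained rate: {sustained_hours:.2f} h")
print(f"Sustained / peak: {peak_fraction:.1%}")
```

Under these assumptions the quoted totals correspond to a few hours of aggregate machine time at the sustained rate, which is about one seventh of the quoted peak.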

2.
Preprint in English | bioRxiv | ID: ppbiorxiv-463779

ABSTRACT

The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside a human cell. Attacking RTC function with pharmaceutical compounds is a pathway to treating COVID-19. Conventional tools, e.g., cryo-electron microscopy and all-atom molecular dynamics (AAMD), do not provide sufficiently high resolution or timescale to capture important dynamics of this molecular machine. Consequently, we develop an innovative workflow that bridges the gap between these resolutions, using mesoscale fluctuating finite element analysis (FFEA) continuum simulations and a hierarchy of AI methods that continually learn and infer features for maintaining consistency between AAMD and FFEA simulations. We leverage a multi-site distributed workflow manager to orchestrate AI, FFEA, and AAMD jobs, providing optimal resource utilization across HPC centers. Our study provides unprecedented access to study the SARS-CoV-2 RTC machinery, while providing general capability for AI-enabled multi-resolution simulations at scale.

3.
Preprint in English | medRxiv | ID: ppmedrxiv-21263783

ABSTRACT

The SARS-CoV-2 Delta variant (B.1.617.2) was initially identified in India in December 2020. Due to its high transmissibility, its prevalence in the U.S.A. grew from a near-zero baseline in early May 2021 to nearly 100% by late August 2021, according to CDC tracking. We accessed openly available data sources from the public health authorities of seven U.S. states, five U.S. counties, and the District of Columbia on RT-PCR COVID-19 tests split by the COVID-19 vaccination status of individuals tested during this period. Together, these time series enable estimation and tracking of COVID-19 vaccine effectiveness (VE*) (against RT-PCR diagnosed infection) concurrently with the growth of Delta variant prevalence in those locations. Our analyses reveal that in each locality the VE* for the combined set of all three US vaccines remained relatively stable and quite well-performing, despite the dramatic concurrent rise of Delta variant prevalence. We conclude that the Delta variant does not significantly evade vaccine-induced immunity. The variations in our measured VE* appear to be driven by demographic factors affecting the composition of the vaccinated cohorts, particularly as pertains to age distribution. We report that the measured VE*, aggregated across the collected sites, began at a value of about 0.9 in mid-May, declined to about 0.76 by mid-July, and recovered to about 0.9 by mid-September.

Summary: We estimated local COVID-19 vaccine effectiveness using RT-PCR COVID-19 test data broken out by vaccination status from select localities in the U.S.A. between 15 May and 15 September 2021 while the SARS-CoV-2 Delta variant (B.1.617.2) was ascending from essentially zero prevalence to total dominance of the genome, and showed that the rise of the Delta variant had negligible effect on vaccine effectiveness.
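
The abstract does not give the formula behind VE*. As an illustration only, the sketch below uses the common relative-risk definition of vaccine effectiveness, one minus the ratio of the infection rate among vaccinated people to the rate among unvaccinated people. Whether the authors' VE* is computed this way is an assumption, and the function name and the counts are hypothetical, not data from the study.

```python
def vaccine_effectiveness(
    positives_vaccinated: int,
    population_vaccinated: int,
    positives_unvaccinated: int,
    population_unvaccinated: int,
) -> float:
    """Relative-risk vaccine effectiveness: 1 - (rate_vaccinated / rate_unvaccinated).

    This is the textbook definition, assumed here for illustration; it is not
    necessarily the VE* estimator used in the preprint.
    """
    if population_vaccinated <= 0 or population_unvaccinated <= 0:
        raise ValueError("population sizes must be positive")
    if positives_unvaccinated <= 0:
        raise ValueError("need at least one positive among the unvaccinated")
    rate_vaccinated = positives_vaccinated / population_vaccinated
    rate_unvaccinated = positives_unvaccinated / population_unvaccinated
    return 1.0 - rate_vaccinated / rate_unvaccinated


# Hypothetical counts for one locality and one reporting period.
ve = vaccine_effectiveness(
    positives_vaccinated=10,
    population_vaccinated=1000,
    positives_unvaccinated=100,
    population_unvaccinated=1000,
)
print(f"Estimated VE: {ve:.2f}")
```

Applying such a function to each reporting period in turn gives a time series of effectiveness estimates that can be set alongside variant prevalence over the same dates.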

4.
Preprint in English | bioRxiv | ID: ppbiorxiv-437323

ABSTRACT

Despite the recent availability of vaccines against the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the search for inhibitory therapeutic agents has assumed importance, especially in the context of emerging new viral variants. In this paper, we describe the discovery of a novel non-covalent small-molecule inhibitor, MCULE-5948770040, that binds to and inhibits the SARS-CoV-2 main protease (Mpro) by employing a scalable high-throughput virtual screening (HTVS) framework and a targeted compound library of over 6.5 million molecules that could be readily ordered and purchased. Our HTVS framework leverages the U.S. supercomputing infrastructure, achieving nearly 91% resource utilization and nearly 126 million docking calculations per hour. Downstream biochemical assays validate this Mpro inhibitor with an inhibition constant (Ki) of 2.9 µM [95% CI 2.2, 4.0]. Further, using room-temperature X-ray crystallography, we show that MCULE-5948770040 binds to a cleft in the primary binding site of Mpro, forming stable hydrogen-bond and hydrophobic interactions. We then used multiple µs-timescale molecular dynamics (MD) simulations and machine learning (ML) techniques to elucidate how the bound ligand alters the conformational states accessed by Mpro, involving motions both proximal and distal to the binding site. Together, our results demonstrate how MCULE-5948770040 inhibits Mpro and offer a springboard for further therapeutic design.

Significance Statement: The ongoing novel coronavirus pandemic (COVID-19) has prompted a global race towards finding effective therapeutics that can target the various viral proteins. Despite many virtual screening campaigns in development, the discovery of validated inhibitors for SARS-CoV-2 protein targets has been limited. We discover a novel inhibitor against the SARS-CoV-2 main protease. Our integrated platform applies downstream biochemical assays, X-ray crystallography, and atomistic simulations to obtain a comprehensive characterization of its inhibitory mechanism. Inhibiting Mpro can lead to significant biomedical advances in targeting SARS-CoV-2 treatment, as it plays a crucial role in viral replication.
