ABSTRACT
BACKGROUND: Identifying individuals with a higher risk of developing severe coronavirus disease 2019 (COVID-19) outcomes will inform targeted and more intensive clinical monitoring and management. To date, there is mixed evidence regarding the impact of preexisting autoimmune disease (AID) diagnosis and/or immunosuppressant (IS) exposure on developing severe COVID-19 outcomes. METHODS: A retrospective cohort of adults diagnosed with COVID-19 was created in the National COVID Cohort Collaborative enclave. Two outcomes, life-threatening disease and hospitalization, were evaluated by using logistic regression models with and without adjustment for demographics and comorbidities. RESULTS: Of the 2 453 799 adults diagnosed with COVID-19, 191 520 (7.81%) had a preexisting AID diagnosis and 278 095 (11.33%) had a preexisting IS exposure. Logistic regression models adjusted for demographics and comorbidities demonstrated that individuals with a preexisting AID (odds ratio [OR], 1.13; 95% confidence interval [CI]: 1.09-1.17; P < .001), IS exposure (OR, 1.27; 95% CI: 1.24-1.30; P < .001), or both (OR, 1.35; 95% CI: 1.29-1.40; P < .001) were more likely to have a life-threatening disease. These results were consistent when hospitalization was evaluated. A sensitivity analysis evaluating specific IS revealed that tumor necrosis factor inhibitors were protective against life-threatening disease (OR, 0.80; 95% CI: .66-.96; P = .017) and hospitalization (OR, 0.80; 95% CI: .73-.89; P < .001). CONCLUSIONS: Patients with preexisting AID, IS exposure, or both are more likely to have a life-threatening disease or hospitalization. These patients may thus require tailored monitoring and preventative measures to minimize negative consequences of COVID-19.
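The odds ratios above come from logistic regression, where each OR is the exponential of a fitted coefficient. As a rough illustration only, the sketch below recovers the approximate log-odds coefficient, its standard error, and a two-sided p-value from the rounded OR and 95% CI reported for preexisting AID. The function name and the normal (Wald) approximation are assumptions made here, not the authors' code, and because the inputs are rounded the derived p-value will not match the exact published value.

```python
import math

def log_odds_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back out a Wald-style summary from a reported OR and 95% CI (hypothetical helper)."""
    beta = math.log(odds_ratio)                                   # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)    # CI width on the log scale
    z = beta / se                                                 # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2))                          # two-sided normal p-value
    return beta, se, z, p

# Reported adjusted OR for preexisting AID: 1.13 (95% CI 1.09-1.17)
beta, se, z, p = log_odds_summary(1.13, 1.09, 1.17)
print(f"beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p<0.001: {p < 0.001}")
```

The derived p-value falls well below .001, which is consistent with the threshold reported in the abstract.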
Subjects
Autoimmunity, COVID-19, Adult, Humans, COVID-19/epidemiology, Retrospective Studies, Hospitalization, Immunosuppressive Agents/therapeutic use
ABSTRACT
A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself, OME-Zarr, along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain: the file format that underlies so many personal, institutional, and global data management and analysis tasks.
Subjects
Microscopy, Software, Humans, Community Support
ABSTRACT
BACKGROUND: Multi-institution electronic health records (EHR) are a rich source of real world data (RWD) for generating real world evidence (RWE) regarding the utilization, benefits and harms of medical interventions. They provide access to clinical data from large pooled patient populations in addition to laboratory measurements unavailable in insurance claims-based data. However, secondary use of these data for research requires specialized knowledge and careful evaluation of data quality and completeness. We discuss data quality assessments undertaken during the conduct of prep-to-research, focusing on the investigation of treatment safety and effectiveness. METHODS: Using the National COVID Cohort Collaborative (N3C) enclave, we defined a patient population using criteria typical in non-interventional inpatient drug effectiveness studies. We present the challenges encountered when constructing this dataset, beginning with an examination of data quality across data partners. We then discuss the methods and best practices used to operationalize several important study elements: exposure to treatment, baseline health comorbidities, and key outcomes of interest. RESULTS: We share our experiences and lessons learned when working with heterogeneous EHR data from over 65 healthcare institutions and 4 common data models. We discuss six key areas of data variability and quality. (1) The specific EHR data elements captured from a site can vary depending on source data model and practice. (2) Data missingness remains a significant issue. (3) Drug exposures can be recorded at different levels and may not contain route of administration or dosage information. (4) Reconstruction of continuous drug exposure intervals may not always be possible. (5) EHR discontinuity is a major concern for capturing history of prior treatment and comorbidities. Lastly, (6) access to EHR data alone limits the potential outcomes which can be used in studies. 
CONCLUSIONS: The creation of large scale centralized multi-site EHR databases such as N3C enables a wide range of research aimed at better understanding treatments and health impacts of many conditions including COVID-19. As with all observational research, it is important that research teams engage with appropriate domain experts to understand the data in order to define research questions that are both clinically important and feasible to address using these real world data.
Subjects
COVID-19, Humans, Data Accuracy, COVID-19 Drug Treatment, Data Collection
ABSTRACT
Bent-shaped liquid crystals have attracted significant attention recently due to their novel mesostructure and the intriguing behavior of their elastic constants, which are strongly anisotropic and have an unusual temperature dependence. Though theories explain the onset of the twist-bend nematic phase (NTB) through spontaneous symmetry breaking concomitant with transition to a negative bend (K3) elastic constant, this has not been observed as yet in experiments. There, the small bend elastic constant has a strongly non-monotonic temperature dependence, which first increases after crossing the isotropic (I)-nematic (N) transition, then dips near the nematic (N)-twist-bend (NTB) transition before it increases again as the transition is crossed. The molecular mechanisms responsible for this exotic behavior are unclear. Here, we utilize density of states algorithms in Monte Carlo simulation applied to a variant of the Lebwohl-Lasher model which includes bent-shaped-like interactions to analyze the mechanism behind elastic response in this novel mesostructure and understand the temperature dependence of its Frank-Oseen elastic constants.
ABSTRACT
The success of enhanced sampling molecular simulations that accelerate along collective variables (CVs) is predicated on the availability of variables coincident with the slow collective motions governing the long-time conformational dynamics of a system. It is challenging to intuit these slow CVs for all but the simplest molecular systems, and their data-driven discovery directly from molecular simulation trajectories has been a central focus of the molecular simulation community to both unveil the important physical mechanisms and drive enhanced sampling. In this work, we introduce state-free reversible VAMPnets (SRV) as a deep learning architecture that learns nonlinear CV approximants to the leading slow eigenfunctions of the spectral decomposition of the transfer operator that evolves equilibrium-scaled probability distributions through time. Orthogonality of the learned CVs is naturally imposed within network training without added regularization. The CVs are inherently explicit and differentiable functions of the input coordinates making them well-suited to use in enhanced sampling calculations. We demonstrate the utility of SRVs in capturing parsimonious nonlinear representations of complex system dynamics in applications to 1D and 2D toy systems where the true eigenfunctions are exactly calculable and to molecular dynamics simulations of alanine dipeptide and the WW domain protein.
ABSTRACT
Polyelectrolytes may be classified into two primary categories (strong and weak) depending on how their charge state responds to the local environment. Both of these find use in many applications, including drug delivery, gene therapy, layer-by-layer films, and fabrication of ion filtration membranes. The mechanism of polyelectrolyte complexation is, however, still not completely understood, though experimental investigations suggest that entropy gain due to release of counterions is the key driving force for strong polyelectrolyte complexation. Here we perform a comprehensive thermodynamic investigation through coarse-grained molecular simulations permitting us to calculate the free energy of complex formation. Importantly, our expanded-ensemble methods permit the explicit separation of energetic and entropic contributions to the free energy. Our investigations indicate that entropic contributions indeed dominate the free energy of complex formation for strong polyelectrolytes, but are less important than energetic contributions when weak electrostatic coupling or weak polyelectrolytes are present. Our results provide a new view of the free energy of polyelectrolyte complex formation driven by polymer association, which should also arise in systems with large charge spacings or bulky counterions, both of which act to weaken ion-polymer binding.
ABSTRACT
Experiments on confined droplets of the nematic liquid crystal 5CB have questioned long-established bounds imposed on the elastic free energy of nematic systems. This elasticity, which derives from molecular alignment within nematic systems, is quantified through a set of moduli which can be difficult to measure experimentally and, in some cases, can only be probed indirectly. This is particularly true of the surfacelike saddle-splay elastic term, for which the available experimental data indicate values on the cusp of stability, often with large uncertainties. Here, we demonstrate that all nematic elastic moduli, including the saddle-splay elastic constant k_{24}, may be calculated directly from atomistic molecular simulations. Importantly, results obtained through in silico measurements of the 5CB elastic properties demonstrate unambiguously that saddle-splay elasticity alone is unable to describe the observed confined morphologies.
ABSTRACT
Existing adaptive bias techniques, which seek to estimate free energies and physical properties from molecular simulations, are limited by their reliance on fixed kernels or basis sets which hinder their ability to efficiently conform to varied free energy landscapes. Further, user-specified parameters are in general non-intuitive yet significantly affect the convergence rate and accuracy of the free energy estimate. Here we propose a novel method, wherein artificial neural networks (ANNs) are used to develop an adaptive biasing potential which learns free energy landscapes. We demonstrate that this method is capable of rapidly adapting to complex free energy landscapes and is not prone to boundary or oscillation problems. The method is made robust to hyperparameters and overfitting through Bayesian regularization which penalizes network weights and auto-regulates the number of effective parameters in the network. ANN sampling represents a promising innovative approach which can resolve complex free energy landscapes in less time than conventional approaches while requiring minimal user input.
ABSTRACT
Weak polyelectrolytes are relevant for a wide range of fields; in particular, they have been investigated as "smart" materials for chemical separations and drug delivery. The charges on weak polyelectrolytes are dynamic, causing polymer chains to adopt different equilibrium conformations even with relatively small changes to the surrounding environment. Currently, there exists no comprehensive picture of this behavior, particularly where polymer-polymer interactions have the potential to affect charging properties significantly. In this study, we elucidate the novel interplay between weak polyelectrolyte charging and complexation behavior through coupled molecular dynamics and Monte Carlo simulations. Specifically, we investigate a model of two equal-length and oppositely charging polymer chains in an implicit salt solution represented through Debye-Hückel interactions. The charging tendency of each chain, along with the salt concentration, is varied to determine the existence and extent of cooperativity in charging and complexation. Strong cooperation in the charging of these chains is observed at large Debye lengths, corresponding to low salt concentrations, while at lower Debye lengths (higher salt concentrations), the chains behave in apparent isolation. When the electrostatic coupling is long-ranged, we find that a highly charged chain strongly promotes the charging of its partner chain, even if the environment is unfavorable for an isolated version of that partner chain. Evidence of this phenomenon is supported by a drop in the potential energy of the system, which does not occur at the lower Debye lengths where both potential energies and charge fractions converge for all partner chain charging tendencies. The discovery of this cooperation will be helpful in developing "smart" drug delivery mechanisms by allowing for better predictions for the dissociation point of delivery complexes.
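The link between Debye length and electrostatic coupling described above can be illustrated with the standard Debye-Hückel (screened Coulomb) pair energy. The following is a minimal sketch in reduced units; the function name, the unit Bjerrum length, and the chosen separation and Debye lengths are illustrative assumptions, not parameters taken from the study.

```python
import math

def debye_huckel_energy(r, q1, q2, debye_length, bjerrum_length=1.0):
    """Screened Coulomb pair energy in units of kT (all lengths in the same reduced unit)."""
    return bjerrum_length * q1 * q2 * math.exp(-r / debye_length) / r

# Two opposite unit charges at the same separation, under weak and strong screening.
weak_screening = debye_huckel_energy(2.0, +1, -1, debye_length=10.0)   # low salt
strong_screening = debye_huckel_energy(2.0, +1, -1, debye_length=1.0)  # high salt

print(f"low salt:  {weak_screening:.3f} kT")
print(f"high salt: {strong_screening:.3f} kT")
```

At the larger Debye length the attraction between the opposite charges is several times stronger, which is the regime in which the abstract reports cooperative charging.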
ABSTRACT
A machine learning assisted method is presented for molecular simulation of systems with rugged free energy landscapes. The method is general and can be combined with other advanced sampling techniques. In the particular implementation proposed here, it is illustrated in the context of an adaptive biasing force approach where, rather than relying on discrete force estimates, one can resort to a self-regularizing artificial neural network to generate continuous, estimated generalized forces. By doing so, the proposed approach addresses several shortcomings common to adaptive biasing force and other algorithms. Specifically, the neural network enables (1) smooth estimates of generalized forces in sparsely sampled regions, (2) force estimates in previously unexplored regions, and (3) continuous force estimates with which to bias the simulation, as opposed to biases generated at specific points of a discrete grid. The usefulness of the method is illustrated with three different examples, chosen to highlight the wide range of applicability of the underlying concepts. In all three cases, the new method is found to enhance considerably the underlying traditional adaptive biasing force approach. The method is also found to provide improvements over previous implementations of neural network assisted algorithms.
ABSTRACT
Molecular simulation has emerged as an essential tool for modern-day research, but obtaining proper results and making reliable conclusions from simulations requires adequate sampling of the system under consideration. To this end, a variety of methods exist in the literature that can enhance sampling considerably, and increasingly sophisticated, effective algorithms continue to be developed at a rapid pace. Implementation of these techniques, however, can be challenging for experts and non-experts alike. There is a clear need for software that provides rapid, reliable, and easy access to a wide range of advanced sampling methods and that facilitates implementation of new techniques as they emerge. Here we present SSAGES, a publicly available Software Suite for Advanced General Ensemble Simulations designed to interface with multiple widely used molecular dynamics simulations packages. SSAGES allows facile application of a variety of enhanced sampling techniques (including adaptive biasing force, string methods, and forward flux sampling) that extract meaningful free energy and transition path data from all-atom and coarse-grained simulations. A noteworthy feature of SSAGES is a user-friendly framework that facilitates further development and implementation of new methods and collective variables. In this work, the use of SSAGES is illustrated in the context of simple representative applications involving distinct methods and different collective variables that are available in the current release of the suite. The code may be found at: https://github.com/MICCoM/SSAGES-public.
ABSTRACT
Utilizing density-of-states simulations, we perform a full mapping of the phase behavior and elastic responses of binary liquid crystalline mixtures represented by the multicomponent Lebwohl-Lasher model. Our techniques are able to characterize the complete phase diagram, including nematic-nematic phase separation predicted by mean-field theories, but previously not observed in simulations. Mapping this phase diagram permits detailed study of elastic properties across the miscible nematic region. Importantly, we observe for the first time local phase separation and disordering driven by the application of small linear perturbations near the transition temperature and more significantly through nonlinear stresses. These findings are of key importance in systems of blended nematics which contain particulate inclusions, or are otherwise confined.
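For readers unfamiliar with the underlying lattice model, the sketch below evaluates the energy of the standard single-component Lebwohl-Lasher model, in which nearest-neighbor directors interact through the second Legendre polynomial of the cosine of their relative angle. It is only an illustration under stated assumptions: the multicomponent coupling used in the study is not reproduced, and the function names, lattice size, and unit coupling constant are hypothetical choices.

```python
import itertools

def p2(cos_theta):
    """Second Legendre polynomial, P2(x) = (3x^2 - 1) / 2."""
    return 1.5 * cos_theta * cos_theta - 0.5

def lebwohl_lasher_energy(directors, size, epsilon=1.0):
    """Total energy E = -epsilon * sum over nearest-neighbor bonds of P2(u_i . u_j).

    directors maps (x, y, z) lattice sites to unit vectors; the lattice is cubic
    with periodic boundaries, and each bond is counted once (forward neighbors only).
    """
    energy = 0.0
    for x, y, z in itertools.product(range(size), repeat=3):
        u = directors[(x, y, z)]
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            v = directors[((x + dx) % size, (y + dy) % size, (z + dz) % size)]
            cos_theta = sum(a * b for a, b in zip(u, v))
            energy -= epsilon * p2(cos_theta)
    return energy

size = 3
aligned = {site: (0.0, 0.0, 1.0) for site in itertools.product(range(size), repeat=3)}
aligned_energy = lebwohl_lasher_energy(aligned, size)
print(aligned_energy)  # perfectly aligned lattice: -epsilon per bond, 3 bonds per site
```

Because P2 is even in its argument, the energy is unchanged when any director is flipped, which reflects the head-tail symmetry of nematic ordering.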
ABSTRACT
Background: A large share of SARS-CoV-2 infections now occur among previously infected individuals. In this study, we sought to determine whether prior infection modifies disease severity relative to no prior infection. Methods: We used data from first and second COVID-19 episodes in the National COVID Cohort Collaborative, a nationwide collection of de-identified electronic health records. We used nested logistic regressions of monthly cohorts weighted on the inverse probability of prior infection to assess risk of hospitalization, death, and increased severity in the first versus second infection cohorts. Results: We included a total of 2,058,274 individuals in the analysis, 147,592 of whom had two recorded infections. The impact of prior infection differed meaningfully between months. Prior infection was largely protective prior to March 2022, with odds ratios (ORs) as low as 0.66 (95% confidence interval: 0.51 to 0.86) in November 2021 for hospitalization and as low as 0.23 (0.06 to 0.86) in June 2021 for death. However, prior infection was associated with an increased risk of hospitalization and death, mostly after March 2022, when the ORs were as high as 1.87 (1.26 to 2.80) and 2.99 (1.65 to 5.41) in April 2022, respectively. The overall OR for more severe disease was 1.06 (1.03 to 1.10) among previously infected individuals. Conclusion: In the pandemic's first two years, previously infected patients generally had less severe disease than people without prior infection. During the Omicron era, however, previously infected patients had the same or worse severity of disease as patients without prior infection.
ABSTRACT
Background: Post-acute sequelae of COVID-19 (PASC) produce significant morbidity, prompting evaluation of interventions that might lower risk. Selective serotonin reuptake inhibitors (SSRIs) potentially could modulate risk of PASC via their central, hypothesized immunomodulatory, and/or antiplatelet properties, although clinical trial data are lacking. Materials and Methods: This retrospective study was conducted leveraging real-world clinical data within the National COVID Cohort Collaborative (N3C) to evaluate whether SSRIs with agonist activity at the sigma-1 receptor (S1R) lower the risk of PASC, since agonism at this receptor may serve as a mechanism by which SSRIs attenuate an inflammatory response. Additionally, we sought to determine whether the potential benefit could be traced to S1R agonism. Presumed PASC was defined based on a computable PASC phenotype trained on the U09.9 ICD-10 diagnosis code. Results: Of the 17,908 patients identified, 1521 were exposed at baseline to an S1R agonist SSRI, 1803 to a non-S1R agonist SSRI, and 14,584 to neither. Using inverse probability weighting and Poisson regression, relative risk (RR) of PASC was assessed. A 29% reduction in the RR of PASC (0.704 [95% CI, 0.58-0.85]; P = 4 × 10^-4) was seen among patients who received an S1R agonist SSRI compared to SSRI unexposed patients and a 21% reduction in the RR of PASC was seen among those receiving an SSRI without S1R agonist activity (0.79 [95% CI, 0.67-0.93]; P = 0.005). Thus, SSRIs with and without reported agonist activity at the S1R were associated with a significant decrease in the risk of PASC.
ABSTRACT
Importance: Post-acute sequelae of COVID-19 (PASC) produce significant morbidity, prompting evaluation of interventions that might lower risk. Selective serotonin reuptake inhibitors (SSRIs) potentially could modulate risk of PASC via their central, hypothesized immunomodulatory, and/or antiplatelet properties and therefore may be postulated to be of benefit in patients with PASC, although clinical trial data are lacking. Objectives: The main objective was to evaluate whether SSRIs with agonist activity at the sigma-1 receptor lower the risk of PASC, since agonism at this receptor may serve as a mechanism by which SSRIs attenuate an inflammatory response. A secondary objective was to determine whether potential benefit could be traced to sigma-1 agonism by evaluating the risk of PASC among recipients of SSRIs that are not S1R agonists. Design: Retrospective study leveraging real-world clinical data within the National COVID Cohort Collaborative (N3C), a large centralized multi-institutional de-identified EHR database. Presumed PASC was defined based on a computable PASC phenotype trained on the U09.9 ICD-10 diagnosis code to more comprehensively identify patients likely to have the condition, since the ICD code has come into wide-spread use only recently. Setting: Population-based study at US medical centers. Participants: Adults (≥ 18 years of age) with a confirmed COVID-19 diagnosis date between October 1, 2021 and April 7, 2022 and at least one follow up visit 45 days post-diagnosis. Of the 17 933 patients identified, 2021 were exposed at baseline to a S1R agonist SSRI, 1328 to a non-S1R agonist SSRI, and 14 584 to neither. Exposures: Exposure at baseline (at or prior to COVID-19 diagnosis) to an SSRI with documented or presumed agonist activity at the S1R (fluvoxamine, fluoxetine, escitalopram, or citalopram), an SSRI without agonist activity at S1R (sertraline, an antagonist, or paroxetine, which does not appreciably bind to the S1R), or none of these agents. 
Main Outcome and Measurement: Development of PASC based on a previously validated XGBoost-trained algorithm. Using inverse probability weighting and Poisson regression, relative risk (RR) of PASC was assessed. Results: A 26% reduction in the RR of PASC (0.74 [95% CI, 0.63-0.88]; P = 5 × 10^-4) was seen among patients who received an S1R agonist SSRI compared to SSRI unexposed patients and a 25% reduction in the RR of PASC was seen among those receiving an SSRI without S1R agonist activity (0.75 [95% CI, 0.62-0.90]; P = 0.003) compared to SSRI unexposed patients. Conclusions and Relevance: SSRIs with and without reported agonist activity at the S1R were associated with a significant decrease in the risk of PASC. Future prospective studies are warranted.
ABSTRACT
Importance: COVID-19 has placed a monumental burden on the health care system globally. Although COVID-19 is no longer a public health emergency, there is still a pressing need for effective treatments to prevent hospitalization and death. Paxlovid (nirmatrelvir/ritonavir) is a promising and potentially effective antiviral that has received emergency use authorization by the U.S. FDA. Objective: Determine real world effectiveness of Paxlovid nationwide and investigate disparities between treated and untreated eligible patients. Design/Setting/Participants: Population-based cohort study emulating a target trial, using inverse probability weighted models to balance treated and untreated groups on baseline confounders. Participants were patients with a SARS-CoV-2 positive test or diagnosis (index) date between December 2021 and February 2023 selected from the National COVID Cohort Collaborative (N3C) database who were eligible for Paxlovid treatment: namely, adults with at least one risk factor for severe COVID-19 illness and no contraindicated medical conditions who were not using one or more strictly contraindicated medications and were not hospitalized within three days of index. From this cohort we identified patients who were treated with Paxlovid within 5 days of positive test or diagnosis (n = 98,060) and patients who either did not receive Paxlovid or were treated outside the 5-day window (n = 913,079 never treated; n = 1,771 treated after 5 days). Exposures: Treatment with Paxlovid within 5 days of positive COVID-19 test or diagnosis. Main Outcomes and Measures: Hospitalization and death in the 28 days following COVID-19 index date. Results: A total of 1,012,910 COVID-19 positive patients at risk for severe COVID-19 were included, 9.7% of whom were treated with Paxlovid. Uptake varied widely by geographic region and timing, with top adoption areas near 50% and bottom near 0%. Adoption increased rapidly after EUA, reaching steady state by June 2022.
Participants who were treated with Paxlovid had a 26% (RR, 0.742; 95% CI, 0.689-0.812) reduction in hospitalization risk and 73% (RR, 0.269, 95% CI, 0.179-0.370) reduction in mortality risk in the 28 days following COVID-19 index date. Conclusions/Relevance: Paxlovid is effective in preventing hospitalization and death in at-risk COVID-19 patients. These results were robust to a large number of sensitivity considerations. Disclosure: The authors report no disclosures. Key points: Question: Is treatment with Paxlovid (nirmatrelvir/ritonavir) associated with a reduction in 28-day hospitalization and mortality in patients at risk for severe COVID-19? Findings: In this multi-institute retrospective cohort study of 1,012,910 patients, Paxlovid treatment within 5 days after COVID-19 diagnosis reduced 28-day hospitalization and mortality by 26% and 73% respectively, compared to no treatment with Paxlovid within 5 days. Paxlovid uptake was low overall (9.7%) and highly variable. Meaning: In Paxlovid-eligible patients, treatment was associated with decreased risk of hospitalization and death. Results align with prior randomized trials and observational studies, thus supporting the real-world effectiveness of Paxlovid.
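The headline percentages follow directly from the reported figures, and the arithmetic can be checked in a few lines. The sketch below is illustrative only: it converts each relative risk into a percent reduction, (1 - RR) × 100, and recomputes cohort size and uptake from the group counts given in the abstract. The variable and function names are assumptions introduced here.

```python
def percent_reduction(relative_risk):
    """Percent risk reduction implied by a relative risk, rounded to a whole percent."""
    return round((1 - relative_risk) * 100)

# Group sizes as reported in the abstract.
treated_within_5_days = 98_060
never_treated = 913_079
treated_after_5_days = 1_771

total = treated_within_5_days + never_treated + treated_after_5_days
uptake_percent = 100 * treated_within_5_days / total

print(total)                        # cohort size
print(round(uptake_percent, 1))     # percent treated within 5 days
print(percent_reduction(0.742))     # hospitalization
print(percent_reduction(0.269))     # mortality
```

The three group counts sum to the stated cohort of 1,012,910, and the derived uptake and risk reductions agree with the 9.7%, 26%, and 73% quoted in the text.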
ABSTRACT
Importance: Identifying individuals with a higher risk of developing severe COVID-19 outcomes will inform targeted or more intensive clinical monitoring and management. Objective: To examine, using data from the National COVID Cohort Collaborative (N3C), whether patients with pre-existing autoimmune disease (AID) diagnosis and/or immunosuppressant (IS) exposure are at a higher risk of developing severe COVID-19 outcomes. Design, setting, and participants: A retrospective cohort of 2,453,799 individuals diagnosed with COVID-19 between January 1, 2020, and June 30, 2022, was created from the N3C data enclave, which comprises data of 15,231,849 patients from 75 USA data partners. Patients were stratified as those with/without a pre-existing diagnosis of AID and/or those with/without exposure to IS prior to COVID-19. Main outcomes and measures: Two outcomes of COVID-19 severity, derived from the World Health Organization severity score, were defined, namely life-threatening disease and hospitalization. Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated using logistic regression models with and without adjustment for demographics (age, BMI, gender, race, ethnicity, smoking status) and comorbidities (cardiovascular disease, dementia, pulmonary disease, liver disease, type 2 diabetes mellitus, kidney disease, cancer, and HIV infection). Results: In total, 2,453,799 (16.11% of the N3C cohort) adults (age > 18 years) were diagnosed with COVID-19, of whom 191,520 (7.81%) had a prior AID diagnosis, and 278,095 (11.33%) had a prior IS exposure. Logistic regression models adjusted for demographic factors and comorbidities demonstrated that individuals with a prior AID (OR = 1.13, 95% CI 1.09-1.17; p = 2.43E-13), prior exposure to IS (OR = 1.27, 95% CI 1.24-1.30; p = 3.66E-74), or both (OR = 1.35, 95% CI 1.29-1.40; p = 7.50E-49) were more likely to have a life-threatening COVID-19 disease.
These results were confirmed after adjusting for exposure to antivirals and vaccination in a cohort subset with COVID-19 diagnosis dates after December 2021 (AID OR = 1.18, 95% CI 1.02-1.36; p = 2.46E-02; IS OR = 1.60, 95% CI 1.41-1.80; p = 5.11E-14; AID+IS OR = 1.93, 95% CI 1.62-2.30; p = 1.68E-13). These results were consistent when evaluating hospitalization as the outcome and also when stratifying by race and sex. Finally, a sensitivity analysis evaluating specific IS revealed that TNF inhibitors were protective against life-threatening disease (OR = 0.80, 95% CI 0.66-0.96; p = 1.66E-02) and hospitalization (OR = 0.80, 95% CI 0.73-0.89; p = 1.06E-05). Conclusions and Relevance: Patients with pre-existing AID, exposure to IS, or both are more likely to have a life-threatening disease or hospitalization. These patients may thus require tailored monitoring and preventative measures to minimize negative consequences of COVID-19.
ABSTRACT
A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself -- OME-Zarr -- along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain -- the file format that underlies so many personal, institutional, and global data management and analysis tasks.
ABSTRACT
Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or absence of a means to reconstruct molecular configurations precludes the generation of continuous atomistic molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous atomistic simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce atomistic molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates.
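As background for the Markov state models mentioned above, the sketch below shows the simplest way such a kinetic model can be estimated: counting transitions between discrete states at a fixed lag time and row-normalizing the counts. This is a generic maximum-likelihood estimate without reversibility constraints, not the latent space simulator proposed in the abstract, and the function name and toy trajectory are hypothetical.

```python
from collections import Counter

def estimate_transition_matrix(states, n_states, lag=1):
    """Row-normalized transition counts from a discrete state trajectory."""
    counts = Counter(zip(states[:-lag], states[lag:]))   # (from, to) pair counts
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        # A state that is never left in the data gets an all-zero row.
        matrix.append([counts[(i, j)] / row_total if row_total else 0.0
                       for j in range(n_states)])
    return matrix

# Toy two-state trajectory (for example, unfolded = 0 and folded = 1).
trajectory = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
T = estimate_transition_matrix(trajectory, n_states=2)
for row in T:
    print(row)
```

Each row of the resulting matrix gives the estimated probabilities of moving from one state to every other state after one lag time, so long synthetic state sequences can be generated by sampling from it repeatedly.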
ABSTRACT
An adaptive, machine learning-based sampling method is presented for simulation of systems having rugged, multidimensional free energy landscapes. The method's main strength resides in its ability to learn both from the frequency of visits to distinct states and the generalized force estimates that arise in a system as it evolves in phase space. This is accomplished by introducing a self-integrating artificial neural network, which generates an estimate of the free energy directly from its derivatives. The usefulness of the proposed combined approach is examined in the context of two concrete examples, namely, an alanine dipeptide molecule in water and a polymer diffusing through a narrow pore. This new method is found to be robust, faster, and more accurate than approaches that rely only on frequency-based or generalized force-based estimations. After combining the proposed approach with overfill protection and support for sparse data storage and training, the method is shown to be more effective than comparable, previously available techniques and capable of scaling efficiently to larger numbers of collective variables.