ABSTRACT
As recently demonstrated by the COVID-19 pandemic, large-scale pathogen genomic data are crucial to characterize transmission patterns of human infectious diseases. Yet, current methods to process raw sequence data into analysis-ready variants remain slow to scale, hampering rapid surveillance efforts and epidemiological investigations for disease control. Here, we introduce an accelerated, scalable, reproducible, and cost-effective framework for pathogen genomic variant identification and present an evaluation of its performance and accuracy across benchmark datasets of Plasmodium falciparum malaria genomes. We demonstrate superior performance of the GPU framework relative to standard pipelines with mean execution time and computational costs reduced by 27× and 4.6×, respectively, while delivering 99.9% accuracy at enhanced reproducibility.
Subjects
COVID-19, Communicable Diseases, Malaria, COVID-19/epidemiology, COVID-19/genetics, Genomics/methods, Humans, Pandemics, Reproducibility of Results
ABSTRACT
Background: Using a combination of data from routine surveillance, genomic sequencing, and phylogeographic analysis, we tracked the spread and introduction events of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants focusing on a large university community. Methods: Here, we sequenced and analyzed 677 high-quality SARS-CoV-2 genomes from positive RNA samples collected from Purdue University students, faculty, and staff who tested positive for the virus between January 2021 and May 2021, comprising an average of 32% of weekly cases across the time frame. Results: Our analysis of circulating SARS-CoV-2 variants over time revealed periods when variants of concern (VOC) Alpha (B.1.1.7) and Iota (B.1.526) reached rapid dominance and documented that VOC Gamma (P.1) was increasing in frequency as campus surveillance was ending. Phylodynamic analysis of Gamma genomes from campus alongside a subsampling of >20 000 previously published P.1 genomes revealed 10 independent introductions of this variant into the Purdue community, predominantly from elsewhere in the United States, with introductions from within the state of Indiana and from Illinois, and possibly Washington and New York, suggesting a degree of domestic spread. Conclusions: We conclude that a robust and sustained active and passive surveillance program coupled with genomic sequencing during a pandemic offers important insights into the dynamics of pathogen arrival and spread in a campus community and can help guide mitigation measures.
ABSTRACT
Molecular dynamics (MD) simulation has become one of the key tools to obtain deeper insights into biological systems using various levels of description, such as all-atom, united-atom, and coarse-grained models. Recent advances in computing resources and MD programs have significantly accelerated simulations and thus increased the amount of trajectory data. Although many laboratories routinely perform MD simulations, analyzing MD trajectories is still time-consuming and often difficult. ST-analyzer (http://im.bioinformatics.ku.edu/st-analyzer) is a standalone graphical user interface (GUI) toolset for performing various trajectory analyses. ST-analyzer has several outstanding features compared to other existing analysis tools: (i) handling of various trajectory file formats from MD programs such as CHARMM, NAMD, GROMACS, and Amber; (ii) an intuitive web-based GUI environment that minimizes administrative load and reduces the burden on users of adapting to new software environments; (iii) a platform-independent design that works with any existing operating system; (iv) easy integration into job queuing systems, with options for batch processing either on a cluster or in an interactive mode; and (v) independence between the foreground GUI and the background modules, which makes it easier to add personal modules or to recycle and integrate pre-existing scripts that use other analysis tools. The current ST-analyzer contains nine main analysis modules that together contain 18 options, including density profile, lipid deuterium order parameters, surface area per lipid, and membrane hydrophobic thickness. This article introduces ST-analyzer with its design, implementation, and features, and also illustrates practical analysis of lipid bilayer simulations.
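As an illustration of one analysis named above, the lipid deuterium order parameter is conventionally defined as S_CD = <(3 cos²θ - 1)/2>, where θ is the angle between a C-H bond vector and the bilayer normal. The sketch below is not ST-analyzer code; the function name `order_parameter` and the input layout (a list of C-H vectors already extracted from a trajectory) are assumptions made for the example.

```python
import math

def order_parameter(ch_vectors, normal=(0.0, 0.0, 1.0)):
    """Average (3 cos^2(theta) - 1) / 2 over a list of C-H bond vectors.

    ch_vectors: iterable of (x, y, z) tuples (hypothetical input layout).
    normal: bilayer normal; the z axis is assumed by default.
    """
    nx, ny, nz = normal
    n_len = math.sqrt(nx * nx + ny * ny + nz * nz)
    total = 0.0
    count = 0
    for x, y, z in ch_vectors:
        v_len = math.sqrt(x * x + y * y + z * z)
        # Cosine of the angle between the bond vector and the normal.
        cos_theta = (x * nx + y * ny + z * nz) / (v_len * n_len)
        total += 0.5 * (3.0 * cos_theta ** 2 - 1.0)
        count += 1
    return total / count
```

A bond parallel to the normal gives 1, a bond perpendicular to it gives -0.5, and reported values are often quoted as the magnitude |S_CD|.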
Subjects
Computer Graphics, Internet, Lipid Bilayers/chemistry, Molecular Dynamics Simulation, Software
ABSTRACT
For proteins of known structure, the relative enthalpic stability with respect to wild-type, ΔΔH(U), can be estimated by direct computation of the folded and unfolded state energies. We propose a model by which the change in stability upon mutation can be predicted from all-atom molecular dynamics simulations for the folded state and a peptide-based model for the unfolded state. The unfolding enthalpies are expressed in terms of environmental and hydration-solvent reorganization contributions that readily allow a residue-specific analysis of ΔΔH(U). The method is applied to estimate the relative enthalpic stability of variants with buried charged groups in T4 lysozyme. The predicted relative stabilities are in good agreement with experimental data. Environmental factors are observed to contribute more than hydration to the overall ΔΔH(U). The residue-specific analysis finds that the effects of burying charge are both localized and long-range. The enthalpy for hydration-solvent reorganization varies considerably among different amino-acid types, but because the variant folded state structures are similar to those of the wild-type, the hydration-solvent reorganization contribution to ΔΔH(U) is localized at the mutation site, in contrast to environmental contributions. Overall, mutations of apolar and polar amino acids to charged amino acids are destabilizing, but the reasons are complex and differ from site to site.
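The quantities described in this abstract can be written out as follows. This is a reading of the abstract, not the authors' published formulation: the sign convention (unfolded minus folded, variant minus wild-type) and the per-residue index i are assumptions introduced for illustration.

```latex
\begin{align}
\Delta H_U &= H_{\text{unfolded}} - H_{\text{folded}} \\
\Delta\Delta H_U &= \Delta H_U^{\text{variant}} - \Delta H_U^{\text{wild-type}} \\
\Delta\Delta H_U &= \sum_{i} \left( \Delta\Delta H_{\text{env}}^{(i)} + \Delta\Delta H_{\text{hyd}}^{(i)} \right)
\end{align}
```

Under this convention a negative ΔΔH(U) corresponds to a variant that is enthalpically destabilized relative to wild-type, and the sum over residues i separates the environmental term from the hydration-solvent reorganization term.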