Results 1 - 20 of 368
1.
Proc Natl Acad Sci U S A ; 120(4): e2216709120, 2023 Jan 24.
Article in English | MEDLINE | ID: mdl-36652480

ABSTRACT

The global automotive industry sprayed over 2.6 billion liters of paint in 2018, much of it through electrostatic rotary bell atomization, a highly complex process involving the fluid mechanics of rapidly rotating thin films tearing apart into micrometer-thin filaments and droplets. Coating operations account for 65% of the energy usage in a typical automotive assembly plant, representing tens of thousands of gigawatt-hours each year in the United States alone. Optimization of these processes would allow for improved robustness, reduced material waste, increased throughput, and significantly reduced energy usage. Here, we introduce a high-fidelity mathematical and algorithmic framework to analyze rotary bell atomization dynamics at industrially relevant conditions. Our approach couples laboratory experiment with the development of robust non-Newtonian fluid models; devises high-order accurate numerical methods to compute the coupled bell, paint, and gas dynamics; and efficiently exploits high-performance supercomputing architectures. These advances have yielded insight into key dynamics, including i) parametric trends in film, sheeting, and filament characteristics as a function of fluid rheology, delivery rates, and bell speed; ii) the impact of nonuniform film thicknesses on atomization performance; and iii) an understanding of spray composition via primary and secondary atomization. These findings result in coating design principles that are poised to improve energy- and cost-efficiency in a wide array of industrial and manufacturing settings.

2.
BMC Biol ; 22(1): 13, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273258

ABSTRACT

BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the most widely used form of molecular genetic variation in genetic studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The Genome Analysis Toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but high-performance computing versions of this tool have yet to become widely available and affordable. RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms, from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy, with results comparable to previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~3,000 resequenced rice accessions. The entire process took ~16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., the IRGSP RefSeq). CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improves the speed at which SNPs can be called. The workflow is widely applicable, as demonstrated for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a data set of 25 multi-crop reference genomes produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and found to reside within genes and open chromatin regions predicted to have functional consequences. Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.


Subjects
Plant Genome, Single-Nucleotide Polymorphism, Workflow, Plant Breeding, Software, High-Throughput Nucleotide Sequencing/methods
3.
BMC Bioinformatics ; 25(1): 199, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38789933

ABSTRACT

BACKGROUND: Computational models in systems biology are becoming more important as experimental techniques advance to query the mechanistic details responsible for phenotypes of interest. In particular, Boolean models are well suited to describing the complexity of signaling networks while remaining simple enough to scale to very large numbers of components. With the advance of Boolean model inference techniques, the field is moving from an artisanal way of building models of moderate size to a more automated one, leading to very large models. In this context, adapting the simulation software to such increases in complexity is crucial. RESULTS: We present two new developments in the continuous-time Boolean simulators: MaBoSS.MPI, a parallel implementation of MaBoSS that can exploit the computational power of very large CPU clusters, and MaBoSS.GPU, which can use GPU accelerators to perform these simulations. CONCLUSION: These implementations enable simulation and exploration of the behavior of very large models, becoming a valuable analysis tool for the systems biology community.
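To illustrate the continuous-time Boolean formalism that MaBoSS-style simulators implement, here is a minimal Gillespie-type sketch in Python. The two-node network, the rates, and all function names are hypothetical illustrations, not MaBoSS's API:

```python
import random

def gillespie_boolean(rate_up, rate_down, logic, state, t_max, rng):
    """Continuous-time Boolean dynamics: a node whose value disagrees
    with its Boolean rule flips after an exponential waiting time;
    one flip is drawn per event, proportionally to the node rates."""
    t, trajectory = 0.0, [(0.0, dict(state))]
    while t < t_max:
        # nodes currently out of agreement with their logic rule
        flippable = {n: (rate_up[n] if logic[n](state) else rate_down[n])
                     for n in state if logic[n](state) != state[n]}
        total = sum(flippable.values())
        if total == 0:          # fixed point: no transition possible
            break
        t += rng.expovariate(total)
        if t >= t_max:
            break
        r, acc = rng.random() * total, 0.0
        for node, rate in flippable.items():
            acc += rate
            if r <= acc:
                state[node] = not state[node]
                break
        trajectory.append((t, dict(state)))
    return trajectory

# hypothetical 2-node circuit: A activates B, B inhibits A
logic = {"A": lambda s: not s["B"], "B": lambda s: s["A"]}
traj = gillespie_boolean({"A": 1.0, "B": 1.0}, {"A": 1.0, "B": 1.0},
                         logic, {"A": True, "B": False}, 10.0,
                         random.Random(0))
print(len(traj))   # number of recorded states along one stochastic run
```

Averaging many such runs over replicas is what makes the MPI and GPU parallelizations described above natural: trajectories are independent.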


Subjects
Computer Simulation, Software, Systems Biology/methods, Computational Biology/methods, Algorithms, Computer Graphics
4.
Neuroimage ; 291: 120600, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38569979

ABSTRACT

Our knowledge of the organisation of the human brain at the population-level is yet to translate into power to predict functional differences at the individual-level, limiting clinical applications and casting doubt on the generalisability of inferred mechanisms. It remains unknown whether the difficulty arises from the absence of individuating biological patterns within the brain, or from limited power to access them with the models and compute at our disposal. Here we comprehensively investigate the resolvability of such patterns with data and compute at unprecedented scale. Across 23,810 unique participants from UK Biobank, we systematically evaluate the predictability of 25 individual biological characteristics, from all available combinations of structural and functional neuroimaging data. Over 4,526 GPU-hours of computation, we train, optimize, and evaluate 700 individual predictive models out-of-sample, including fully-connected feed-forward neural networks of demographic, psychological, serological, chronic disease, and functional connectivity characteristics, and both uni- and multi-modal 3D convolutional neural network models of macro- and micro-structural brain imaging. We find a marked discrepancy between the high predictability of sex (balanced accuracy 99.7%), age (mean absolute error 2.048 years, R² 0.859), and weight (mean absolute error 2.609 kg, R² 0.625), for which we set new state-of-the-art performance, and the surprisingly low predictability of other characteristics. Neither structural nor functional imaging predicted an individual's psychology better than the coincidence of common chronic disease (p < 0.05). Serology predicted chronic disease (p < 0.05) and was best predicted by it (p < 0.001), followed by structural neuroimaging (p < 0.05). Our findings suggest either more informative imaging or more powerful models will be needed to decipher individual-level characteristics from the human brain. We make our models and code openly available.


Subjects
Brain, Magnetic Resonance Imaging, Humans, Child Preschool, Magnetic Resonance Imaging/methods, Brain/diagnostic imaging, Neural Networks (Computer), Emotions, Chronic Disease, Neuroimaging/methods
5.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36352504

ABSTRACT

In shotgun metagenomics (SM), the state-of-the-art bioinformatic workflows are referred to as high-resolution shotgun metagenomics (HRSM) and require intensive computing and disk storage resources. While the increased data output of the latest generation of high-throughput DNA sequencing systems allows unprecedented sequencing depth at minimal cost, HRSM workflows will need adjustments to properly process these ever-growing sequence datasets. One potential adaptation is to generate so-called shallow SM datasets that contain less sequencing data per sample than classic high-coverage sequencing. While shallow sequencing is a promising avenue for SM data analysis, detailed benchmarks using real data are lacking. In this case study, we took four public SM datasets (one massive, the others moderate in size) and subsampled each at various levels to mimic shallow sequencing datasets of various depths. Our results suggest that shallow SM sequencing is a viable avenue for obtaining sound results on microbial community structure and that high-depth sequencing does not bring additional elements for ecological interpretation. More specifically, results obtained by subsampling as little as 0.5 M sequencing clusters per sample were similar to those obtained with the largest subsampled dataset for the human gut and agricultural soil datasets. For an Antarctic dataset, which contained only a few samples, 4 M sequencing clusters per sample generated results comparable to the full dataset. One area where ultra-deep sequencing and maximizing the use of all data was undeniably beneficial was the generation of metagenome-assembled genomes.
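The subsampling idea can be sketched in a few lines of Python. The taxon table, the depth, and the Bray-Curtis comparison below are illustrative assumptions, not the study's actual pipeline:

```python
import random

def subsample_counts(counts, depth, rng):
    """Rarefy a taxon->count table to a fixed sequencing depth by
    drawing reads without replacement (mimics shallow sequencing)."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    out = {t: 0 for t in counts}
    for t in rng.sample(pool, depth):
        out[t] += 1
    return out

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count profiles (0 = identical)."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    den = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return num / den

# hypothetical community: 10,000 reads over four taxa
full = {"Bacteroides": 5000, "Firmicutes": 3000,
        "Proteobacteria": 1500, "Archaea": 500}
shallow = subsample_counts(full, 1000, random.Random(42))
# rescale the shallow profile to the full depth and compare structures
scaled = {t: n * 10 for t, n in shallow.items()}
bc = bray_curtis(full, scaled)
print(round(bc, 3))   # small: community structure is largely preserved
```

This is the core of the benchmark logic: subsample, then check how far the community profile drifts from the full-depth one.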


Subjects
Metagenomics, Microbiota, Humans, DNA Sequence Analysis/methods, Metagenomics/methods, Metagenome, High-Throughput Nucleotide Sequencing/methods, Microbiota/genetics
6.
Philos Trans A Math Phys Eng Sci ; 382(2275): 20230305, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38910407

ABSTRACT

Physical mechanisms that contribute to the generation of fracture waves in condensed media under intensive dynamic impacts have not been fully studied. One of the hypotheses is that this process is associated with the blocky structure of a material. As the loading wave passes, the compliant interlayers between blocks are fractured, releasing the energy of self-balanced initial stresses in the blocks, which supports the motion of the fracture wave. We propose a new efficient numerical method for the analysis of the wave nature of the propagation of a system of cracks in thin interlayers of a blocky medium with complex rheological properties. The method is based on a variational formulation of the constitutive relations for the deformation of elastic-plastic materials, as well as the conditions for contact interaction of blocks through interlayers. We have developed a parallel computational algorithm that implements this method for supercomputers with cluster architecture. The results of the numerical simulation of the fracture wave propagation in tempered glass under the action of distributed pulse disturbances are presented. This article is part of the theme issue 'Non-smooth variational problems with applications in mechanics'.

7.
Proc Natl Acad Sci U S A ; 118(46)2021 11 16.
Article in English | MEDLINE | ID: mdl-34772803

ABSTRACT

PRACE (Partnership for Advanced Computing in Europe), an international not-for-profit association that brings together the five largest European supercomputing centers and involves 26 European countries, has allocated more than half a billion core hours to computer simulations to fight the COVID-19 pandemic. Alongside experiments, these simulations are a pillar of research to assess the risks of different scenarios and investigate mitigation strategies. While the world deals with the subsequent waves of the pandemic, we present a reflection on the use of urgent supercomputing for global societal challenges and crisis management.


Subjects
COVID-19/epidemiology, Medical Informatics Computing/standards, Europe, Humans, Information Dissemination, Information Systems/standards, Medical Informatics Computing/trends
8.
BMC Bioinformatics ; 24(1): 143, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37046208

ABSTRACT

BACKGROUND: Modeling the whole cardiac function involves solving several complex multi-physics and multi-scale models that are highly computationally demanding, which calls for simpler yet accurate, high-performance computational tools. Despite the efforts made by several research groups, no software for whole-heart fully-coupled cardiac simulations in the scientific community has reached full maturity yet. RESULTS: In this work we present lifex-fiber, an innovative tool for the generation of myocardial fibers based on Laplace-Dirichlet Rule-Based Methods, which are the essential building blocks for modeling the electrophysiological, mechanical and electromechanical cardiac function, from single-chamber to whole-heart simulations. lifex-fiber is the first publicly released module for cardiac simulations based on lifex, an open-source, high-performance Finite Element solver for multi-physics, multi-scale and multi-domain problems developed in the framework of the iHEART project, which aims at making in silico experiments easily reproducible and accessible to a wide community of users, including those with a background in medicine or bio-engineering. CONCLUSIONS: The tool presented in this document is intended to provide the scientific community with a computational tool that incorporates general state-of-the-art models and solvers for simulating the cardiac function within a high-performance framework that exposes a user- and developer-friendly interface. This report comes with extensive technical and mathematical documentation to welcome new users to the core structure of lifex-fiber and to provide them with a possible approach to including the generated cardiac fibers in more sophisticated computational pipelines. In the near future, more modules will be published, either as pre-compiled binaries for x86-64 Linux systems or as open-source software.
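To illustrate the Laplace-Dirichlet rule-based idea, here is a deliberately reduced one-dimensional Python sketch: a Laplace solve across the ventricular wall yields a transmural coordinate, which a simple rule maps to a fiber helix angle. The 60°/-60° angles and the Jacobi solver are common textbook choices used here for illustration, not the tool's implementation:

```python
def transmural_coordinate(n, iters=2000):
    """Solve Laplace's equation on a 1-D wall (phi = 0 at endocardium,
    phi = 1 at epicardium) by Jacobi iteration; the solution is the
    transmural coordinate used by rule-based fiber generators."""
    phi = [0.0] * n
    phi[-1] = 1.0
    for _ in range(iters):
        phi = ([phi[0]]
               + [(phi[i - 1] + phi[i + 1]) / 2 for i in range(1, n - 1)]
               + [phi[-1]])
    return phi

def fiber_angle(phi, endo_deg=60.0, epi_deg=-60.0):
    """Rule: rotate the helix angle linearly from endo to epi."""
    return endo_deg + (epi_deg - endo_deg) * phi

phi = transmural_coordinate(11)
angles = [round(fiber_angle(p), 1) for p in phi]
print(angles[0], angles[-1])   # → 60.0 -60.0
```

In three dimensions the same recipe uses several Laplace solves (apex-base, transmural, inter-ventricular) to build a local frame, but the structure of the rule is the same.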


Subjects
Medicine, Software, Cardiac Myocytes, Computer Simulation
9.
BMC Bioinformatics ; 24(1): 389, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37828428

ABSTRACT

BACKGROUND: Simulating the cardiac function requires the numerical solution of multi-physics and multi-scale mathematical models. This underscores the need for streamlined, accurate, and high-performance computational tools. Despite the dedicated endeavors of various research teams, comprehensive and user-friendly software programs for cardiac simulations, capable of accurately replicating both normal and pathological conditions, are still in the process of achieving full maturity within the scientific community. RESULTS: This work introduces lifex-ep, a publicly available software for numerical simulation of the electrophysiological activity of the cardiac muscle, under both normal and pathological conditions. lifex-ep employs the monodomain equation to model the heart's electrical activity. It incorporates both phenomenological and second-generation ionic models. These models are discretized using the Finite Element method on tetrahedral or hexahedral meshes. Additionally, lifex-ep integrates the generation of myocardial fibers based on Laplace-Dirichlet Rule-Based Methods, previously released within lifex-fiber (Africa et al., 2023). As an alternative, users can also choose to import myofibers from a file. This paper provides a concise overview of the mathematical models and numerical methods underlying lifex-ep, along with comprehensive implementation details and instructions for users. lifex-ep features exceptional parallel speedup, scaling efficiently up to thousands of cores, and its implementation has been verified against an established benchmark problem for computational electrophysiology. We showcase the key features of lifex-ep through various idealized and realistic simulations conducted in both normal and pathological scenarios. Furthermore, the software offers a user-friendly and flexible interface, simplifying the setup of simulations using self-documenting parameter files. CONCLUSIONS: lifex-ep provides easy access to cardiac electrophysiology simulations for a wide user community. It offers a computational tool that integrates models and accurate methods for simulating cardiac electrophysiology within a high-performance framework, while maintaining a user-friendly interface. lifex-ep represents a valuable tool for conducting in silico patient-specific simulations.


Subjects
Cardiac Electrophysiologic Techniques, Software, Humans, Computer Simulation, Myocardium, Africa
10.
BMC Bioinformatics ; 24(1): 133, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37016291

ABSTRACT

BACKGROUND: RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research of non-model organisms, but the computational processing of RNA-seq data entails many different software tools. The complexity of these de novo transcriptomics workflows therefore presents a major barrier for researchers to adopt best-practice methods and up-to-date versions of software. RESULTS: Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware. CONCLUSIONS: transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.


Subjects
Software, Transcriptome, RNA Sequence Analysis/methods, RNA-Seq, Gene Expression Profiling, Molecular Sequence Annotation
11.
Mol Biol Evol ; 39(5)2022 05 03.
Article in English | MEDLINE | ID: mdl-35552742

ABSTRACT

Bayesian phylogenetics has gained substantial popularity in the last decade, with most implementations relying on Markov chain Monte Carlo (MCMC). The computational demands of MCMC mean that remote servers are increasingly used. We present Beastiary, a package for real-time and remote inspection of log files generated by MCMC analyses. Beastiary is an easily deployed web-app that can be used to summarize and visualize the output of many popular software packages including BEAST, BEAST2, RevBayes, and MrBayes via a web browser. We describe the design and implementation of Beastiary and some typical use-cases, with a focus on real-time remote monitoring.
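Reading a trace log of this kind is straightforward; the sketch below (hypothetical file layout and column names, not Beastiary's code) parses a BEAST-style tab-separated log, discards burn-in, and reports post-burn-in means — the basic summary a monitoring tool computes:

```python
import io

def summarize_log(handle, burnin_frac=0.1):
    """Summarize a BEAST-style tab-separated MCMC log: skip '#' comment
    lines, treat the first data line as the header, discard a burn-in
    fraction, and report the mean of every numeric trace column."""
    rows, header = [], None
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if header is None:
            header = fields
        else:
            rows.append([float(x) for x in fields])
    keep = rows[int(len(rows) * burnin_frac):]
    return {name: sum(r[i] for r in keep) / len(keep)
            for i, name in enumerate(header)}

# hypothetical four-sample log fragment
log = io.StringIO(
    "# BEAST v2 log (hypothetical)\n"
    "Sample\tposterior\n"
    "0\t-1000.0\n"
    "1000\t-950.0\n"
    "2000\t-900.0\n"
    "3000\t-905.0\n"
)
means = summarize_log(log, burnin_frac=0.25)
print(means["posterior"])   # post-burn-in mean of the posterior column
```

A real-time monitor simply re-reads the growing file (or tails it) and recomputes such summaries as new samples arrive.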


Subjects
Software, Bayes Theorem, Markov Chains, Monte Carlo Method, Phylogeny
12.
J Comput Chem ; 44(13): 1250-1262, 2023 Feb 27.
Article in English | MEDLINE | ID: mdl-36847779

ABSTRACT

The nucleation of sulfuric acid-water clusters is a significant contributor to the formation of aerosols as precursors of cloud condensation nuclei (CCN). Depending on the temperature, an interplay between the clustering of particles and their evaporation controls the efficiency of cluster growth. For typical atmospheric temperatures, the evaporation of H2SO4-H2O clusters is more efficient than the clustering of the first, small clusters, and thus their growth is dampened at its early stages. Since the evaporation rates of small clusters containing an HSO4− ion are much smaller than those of purely neutral sulfuric acid clusters, such ions can serve as a central body for the further attachment of H2SO4 and H2O molecules. We present an innovative Monte Carlo model to study the growth of aqueous sulfuric acid clusters around central ions. Unlike classical thermodynamic nucleation theory or kinetic models, this model traces individual particles and can thus determine properties for each individual particle. As a benchmarking case, we performed simulations at T = 300 K and a relative humidity of 50%, with dipole and ion concentrations of c_dipole = 5 × 10^8-10^9 cm^-3 and c_ion = 0-10^7 cm^-3. We discuss the runtime of our simulations and present the velocity distribution of ionic clusters, the size distribution of the clusters, and the formation rate of clusters with radii R ≥ 0.85 nm. The simulations give reasonable velocity and size distributions, and the formation rates agree well with previous results, including the relevance of ions for the initial growth of sulfuric acid-water clusters. In conclusion, we present a computational method that allows studying detailed particle properties during the growth of aerosols as precursors of CCN.
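A drastically simplified version of the attachment/evaporation competition can be written as a toy kinetic Monte Carlo loop. All rates, the ion-stabilization factor, and event counts below are invented for illustration and are not the paper's parameterization:

```python
import random

def grow_cluster(attach_rate, evap_rate, n_events, rng, ion_seeded=True):
    """Toy kinetic Monte Carlo for a single cluster: each event is either
    monomer attachment (rate ~ vapor concentration) or evaporation
    (size-dependent rate, assumed lower for ion-seeded clusters,
    mimicking the stabilizing effect of a central ion)."""
    size = 1
    for _ in range(n_events):
        evap = evap_rate(size) * (0.2 if ion_seeded else 1.0)
        if rng.random() < attach_rate / (attach_rate + evap):
            size += 1          # attachment event
        elif size > 1:
            size -= 1          # evaporation event
    return size

# hypothetical rate law: evaporation weakens as the cluster grows
evap_rate = lambda n: 2.0 / n
ion = [grow_cluster(1.0, evap_rate, 500, random.Random(i))
       for i in range(50)]
neutral = [grow_cluster(1.0, evap_rate, 500, random.Random(i),
                        ion_seeded=False) for i in range(50)]
print(sum(ion) / len(ion), sum(neutral) / len(neutral))
```

Even this caricature reproduces the qualitative point of the abstract: ion-seeded clusters pass the small-size bottleneck faster, so their mean final size exceeds that of neutral clusters.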

13.
J Comput Chem ; 44(20): 1740-1749, 2023 07 30.
Article in English | MEDLINE | ID: mdl-37141320

ABSTRACT

Generalized replica exchange with solute tempering (gREST) is one of the enhanced sampling algorithms for proteins and other systems with rugged energy landscapes. Unlike the replica-exchange molecular dynamics (REMD) method, solvent temperatures are the same in all replicas, while solute temperatures differ and are exchanged frequently between replicas to explore various solute structures. Here, we apply the gREST scheme to large biological systems containing over one million atoms, using a large number of processors on a supercomputer. First, communication time on a multi-dimensional torus network is reduced by optimally matching each replica to MPI processes. This is applicable not only to gREST but also to other multi-copy algorithms. Second, the energy evaluations needed by the multistate Bennett acceptance ratio (MBAR) method for free energy estimation are performed on the fly during the gREST simulations. Using these two advanced schemes, we observed 57.72 ns/day performance in 128-replica gREST calculations of a 1.5-million-atom system using 16,384 nodes of Fugaku. These schemes, implemented in the latest version of the GENESIS software, could open new possibilities for answering unresolved questions on large biomolecular complex systems with slow conformational dynamics.


Subjects
Molecular Dynamics Simulation, Proteins, Proteins/chemistry, Software, Temperature, Acceleration
14.
Philos Trans A Math Phys Eng Sci ; 381(2250): 20220251, 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-37211037

ABSTRACT

Amorphous materials have no long-range order in their atomic structure. This makes much of the formalism for the study of crystalline materials irrelevant, and so elucidating their structure and properties is challenging. The use of computational methods is a powerful complement to experimental studies, and in this paper we review the use of high-performance computing methods in the simulation of amorphous materials. Five case studies are presented to showcase the wide range of materials and computational methods available to practitioners in this field. This article is part of a discussion meeting issue 'Supercomputing simulations of advanced materials'.

15.
Philos Trans A Math Phys Eng Sci ; 381(2246): 20220297, 2023 May.
Article in English | MEDLINE | ID: mdl-36907220

ABSTRACT

Previous comparisons of experimental data with nonlinear numerical simulations of density stratified Taylor-Couette (TC) flows revealed nonlinear interactions of strato-rotational instability (SRI) modes that lead to periodic changes in the SRI spirals and their axial propagation. These pattern changes are associated with low-frequency velocity modulations that are related to the dynamics of two competing spiral wave modes propagating in opposite directions. In the present paper, a parameter study of the SRI is performed using direct numerical simulations to evaluate the influence of the Reynolds numbers, the stratification, and of the container geometry on these SRI low-frequency modulations and spiral pattern changes. The results of this parameter study show that the modulations can be considered as a secondary instability that are not observed for all SRI unstable regimes. The findings are of interest when the TC model is related to star formation processes in accretion discs. This article is part of the theme issue 'Taylor-Couette and related flows on the centennial of Taylor's seminal Philosophical Transactions paper (Part 2)'.

16.
Sensors (Basel) ; 24(1)2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38203006

ABSTRACT

The computational performance requirements of space payloads are constantly increasing, and the redevelopment of space-grade processors requires a significant amount of time and is costly. This study investigates performance evaluation benchmarks for processors designed for various application scenarios. It also constructs benchmark modules and typical space application benchmarks specifically tailored to the space domain. Furthermore, the study systematically evaluates and analyzes the performance of the NVIDIA Jetson AGX Xavier and Loongson platforms to identify processors that are suitable for space missions. The evaluation results demonstrate that the Jetson AGX Xavier performs exceptionally well and consumes less power during dense computations. The Loongson platform can achieve 80% of Xavier's performance in certain parallel-optimized computations, and can surpass Xavier's performance only at the expense of higher power consumption.

17.
Sensors (Basel) ; 23(6)2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36991712

ABSTRACT

This research describes the use of high-performance computing (HPC) and deep learning to create prediction models that can be deployed on camera-equipped edge AI devices installed in poultry farms. The main idea is to leverage an existing IoT farming platform and use HPC offline to train deep learning models for object detection and object segmentation, where the objects are chickens in images taken on the farm. The models can be ported from HPC to edge AI devices to create a new type of computer vision kit that enhances the existing digital poultry farm platform. Such new sensors enable functions such as counting chickens, detecting dead chickens, and even assessing their weight or detecting uneven growth. These functions, combined with the monitoring of environmental parameters, could enable early disease detection and improve the decision-making process. The experiments focused on Faster R-CNN architectures, and AutoML was used to identify the most suitable architecture for chicken detection and segmentation on the given dataset. For the selected architectures, further hyperparameter optimization was carried out, achieving AP = 85%, AP50 = 98%, and AP75 = 96% for object detection and AP = 90%, AP50 = 98%, and AP75 = 96% for instance segmentation. These models were installed on edge AI devices and evaluated online on actual poultry farms. Initial results are promising, but further development of the dataset and improvements to the prediction models are needed.


Subjects
Deep Learning, Poultry, Animals, Farms, Chickens, Computers
18.
Int J High Perform Comput Appl ; 37(1): 4-27, 2023 Jan.
Article in English | MEDLINE | ID: mdl-38603425

ABSTRACT

This paper describes an integrated, data-driven operational pipeline based on national agent-based models to support federal and state-level pandemic planning and response. The pipeline consists of (i) an automatic semantic-aware scheduling method that coordinates jobs across two separate high performance computing systems; (ii) a data pipeline to collect, integrate and organize national and county-level disaggregated data for initialization and post-simulation analysis; (iii) a digital twin of national social contact networks made up of 288 million individuals and 12.6 billion time-varying interactions covering the US states and DC; (iv) an extension of a parallel agent-based simulation model to study epidemic dynamics and associated interventions. This pipeline can run 400 replicates of national runs in less than 33 h, and reduces the need for human intervention, resulting in faster turnaround times and higher reliability and accuracy of the results. Scientifically, the work has led to significant advances in real-time epidemic sciences.
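The agent-based core of such a pipeline can be illustrated with a tiny discrete-time SIR simulation on a synthetic contact network. The network, parameters, and scale below are hypothetical toys; the real model is vastly richer (time-varying contacts, interventions, county-level data):

```python
import random

def sir_on_network(contacts, seeds, beta, gamma, steps, rng):
    """Discrete-time SIR agent-based simulation: each step, every
    infected agent transmits along its contacts with probability beta,
    then recovers with probability gamma."""
    state = {n: "S" for n in contacts}
    for s in seeds:
        state[s] = "I"
    history = []
    for _ in range(steps):
        new_state = dict(state)
        for node, st in state.items():
            if st != "I":
                continue
            for nb in contacts[node]:
                if state[nb] == "S" and rng.random() < beta:
                    new_state[nb] = "I"
            if rng.random() < gamma:
                new_state[node] = "R"
        state = new_state
        history.append(sum(1 for v in state.values() if v == "I"))
    return state, history

# hypothetical 100-agent ring network with a few long-range shortcuts
n = 100
contacts = {i: {(i - 1) % n, (i + 1) % n, (i * 7) % n} - {i}
            for i in range(n)}
state, hist = sir_on_network(contacts, seeds=[0], beta=0.4, gamma=0.1,
                             steps=50, rng=random.Random(1))
print(sum(1 for v in state.values() if v != "S"))   # agents ever infected
```

Scaling this loop to 288 million agents and billions of interactions is precisely what requires the HPC scheduling and data machinery the abstract describes.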

19.
BMC Bioinformatics ; 23(1): 544, 2022 Dec 16.
Article in English | MEDLINE | ID: mdl-36526957

ABSTRACT

BACKGROUND: The Basic Local Alignment Search Tool (BLAST) is a suite of commonly used algorithms for identifying matches between biological sequences. The user supplies a database file and a query file of sequences for BLAST to find matching sequences between the two. The typical millions of database and query sequences make BLAST computationally challenging, but also well suited for parallelization on high-performance computing clusters. The efficacy of parallelization depends on the data partitioning, and the optimal data partitioning relies on an accurate performance model. In previous studies, a BLAST job was sped up 27-fold by partitioning the database and query among thousands of processor nodes, but the optimality of the partitioning method was not studied. Unlike BLAST performance models proposed in the literature, which usually have problem size and hardware configuration as the only variables, the execution time of a BLAST job is a function of database size, query size, and hardware capability. In this work, the nucleotide BLAST application BLASTN was profiled using three methods: shell-level profiling with the Unix "time" command, code-level profiling with the built-in "profiler" module, and system-level profiling with the Unix "gprof" program. The runtimes were measured for six node types, using six different database files and 15 query files, on a heterogeneous HPC cluster with 500+ nodes. The empirical measurement data were fitted with quadratic functions to develop performance models that guided the data parallelization of BLASTN jobs. RESULTS: Profiling results showed that BLASTN contains more than 34,500 different functions, but a single function, RunMTBySplitDB, takes 99.12% of the total runtime. Among its 53 child functions, five core functions were identified that make up 92.12% of the overall BLASTN runtime. Based on the performance models, static load balancing algorithms can be applied to the BLASTN input data to minimize the runtime of the longest job on an HPC cluster. Four test cases were run on homogeneous and heterogeneous clusters. Experimental results showed that the runtime can be reduced by 81% on a homogeneous cluster and by 20% on a heterogeneous cluster by re-distributing the workload. DISCUSSION: Optimal data partitioning can improve BLASTN's overall runtime 5.4-fold compared with dividing the database and query into the same number of fragments. The proposed methodology can be used in other applications in the BLAST+ suite, or in any other application as long as the source code is available.
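The two ingredients, a fitted runtime model and speed-proportional static partitioning, can be sketched as follows. The coefficients and node speeds are made-up placeholders, not the paper's fitted values:

```python
def predicted_runtime(coeffs, db_size, n_queries):
    """Hypothetical quadratic performance model of the kind fitted from
    profiling data: t = a + b*w + c*w^2, with workload w = db * queries."""
    a, b, c = coeffs
    w = db_size * n_queries
    return a + b * w + c * w * w

def balance_queries(total_queries, node_speeds):
    """Static load balancing: give each node a query share proportional
    to its benchmarked speed, so predicted runtimes (and hence the
    longest job) are roughly equalized on a heterogeneous cluster."""
    total = sum(node_speeds)
    shares = [int(total_queries * s / total) for s in node_speeds]
    shares[0] += total_queries - sum(shares)  # rounding remainder to node 0
    return shares

# three heterogeneous node types with assumed relative speeds 1:2:3
shares = balance_queries(15000, [1.0, 2.0, 3.0])
print(shares)   # → [2500, 5000, 7500]

# evaluate the assumed model for one node (a=5 s, w = 1e6 * 100)
t = predicted_runtime((5.0, 2e-6, 1e-14), 1e6, 100)
print(t)        # about 305 seconds under these made-up coefficients
```

In the study the model coefficients come from regression over profiled runs per node type; the same speed-proportional split then minimizes the makespan across the cluster.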


Subjects
Computational Methodologies, Software, Algorithms, Computational Biology/methods, Sequence Alignment
20.
Neuroimage ; 251: 118973, 2022 05 01.
Article in English | MEDLINE | ID: mdl-35131433

ABSTRACT

The Virtual Brain (TVB) is now available as open-source services on the cloud research platform EBRAINS (ebrains.eu). It offers software for constructing, simulating and analysing brain network models including the TVB simulator; magnetic resonance imaging (MRI) processing pipelines to extract structural and functional brain networks; combined simulation of large-scale brain networks with small-scale spiking networks; automatic conversion of user-specified model equations into fast simulation code; simulation-ready brain models of patients and healthy volunteers; Bayesian parameter optimization in epilepsy patient models; data and software for mouse brain simulation; and extensive educational material. TVB cloud services facilitate reproducible online collaboration and discovery of data assets, models, and software embedded in scalable and secure workflows, a precondition for research on large cohort data sets, better generalizability, and clinical translation.


Subjects
Brain, Cloud Computing, Animals, Bayes Theorem, Brain/diagnostic imaging, Computer Simulation, Humans, Magnetic Resonance Imaging/methods, Mice, Software