Pesquisa | Portal Regional da BVS

1.

Optimizing storage on fog computing edge servers: A recent algorithm design with minimal interference.

Zhao, Xumin; Xie, Guojie; Luo, Yi; Chen, Jingyuan; Liu, Fenghua; Bai, HongPeng.

PLoS One ; 19(7): e0304009, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38985790

RESUMO

The burgeoning field of fog computing introduces a transformative computing paradigm with extensive applications across diverse sectors. At the heart of this paradigm lies the pivotal role of edge servers, which are entrusted with critical computing and storage functions. The optimization of these servers' storage capacities emerges as a crucial factor in augmenting the efficacy of fog computing infrastructures. This paper presents a novel storage optimization algorithm, dubbed LIRU (Low Interference Recently Used), which synthesizes the strengths of the LIRS (Low Interference Recency Set) and LRU (Least Recently Used) replacement algorithms. Set against the backdrop of constrained storage resources, this research endeavours to formulate an algorithm that optimizes storage space utilization, elevates data access efficiency, and diminishes access latencies. The investigation initiates a comprehensive analysis of the storage resources available on edge servers, pinpointing the essential considerations for optimization algorithms: storage resource utilization and data access frequency. The study then constructs an optimization model that harmonizes data frequency with cache capacity, employing optimization theory to discern the optimal solution for storage maximization. Subsequent experimental validations of the LIRU algorithm underscore its superiority over conventional replacement algorithms, showcasing significant improvements in storage utilization, data access efficiency, and reduced access delays. Notably, the LIRU algorithm registers a 5% increment in one-hop hit ratio relative to the LFU algorithm, a 66% enhancement over the LRU algorithm, and a 14% elevation in system hit ratio against the LRU algorithm. Moreover, it curtails the average system response time by 2.4% and 16.5% compared to the LRU and LFU algorithms, respectively, particularly in scenarios involving large cache sizes. This research not only sheds light on the intricacies of edge server storage optimization but also significantly propels the performance and efficiency of the broader fog computing ecosystem. Through these insights, the study contributes a valuable framework for enhancing data management strategies within fog computing architectures, marking a noteworthy advancement in the field.

Assuntos

Algoritmos , Armazenamento e Recuperação da Informação/métodos , Computação em Nuvem

2.

Self-learning activation functions to increase accuracy of privacy-preserving Convolutional Neural Networks with homomorphic encryption.

Pulido-Gaytan, Bernardo; Tchernykh, Andrei.

PLoS One ; 19(7): e0306420, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-39038028

RESUMO

The widespread adoption of cloud computing necessitates privacy-preserving techniques that allow information to be processed without disclosure. This paper proposes a method to increase the accuracy and performance of privacy-preserving Convolutional Neural Networks with Homomorphic Encryption (CNN-HE) by Self-Learning Activation Functions (SLAF). SLAFs are polynomials with trainable coefficients updated during training, together with synaptic weights, for each polynomial independently to learn task-specific and CNN-specific features. We theoretically prove its feasibility to approximate any continuous activation function to the desired error as a function of the SLAF degree. Two CNN-HE models are proposed: CNN-HE-SLAF and CNN-HE-SLAF-R. In the first model, all activation functions are replaced by SLAFs, and CNN is trained to find weights and coefficients. In the second one, CNN is trained with the original activation, then weights are fixed, activation is substituted by SLAF, and CNN is shortly re-trained to adapt SLAF coefficients. We show that such self-learning can achieve the same accuracy 99.38% as a non-polynomial ReLU over non-homomorphic CNNs and lead to an increase in accuracy (99.21%) and higher performance (6.26 times faster) than the state-of-the-art CNN-HE CryptoNets on the MNIST optical character recognition benchmark dataset.

Assuntos

Segurança Computacional , Redes Neurais de Computação , Privacidade , Humanos , Algoritmos , Computação em Nuvem

3.

CloudATAC: a cloud-based framework for ATAC-Seq data analysis.

Veerappa, Avinash M; Rowley, M Jordan; Maggio, Angela; Beaudry, Laura; Hawkins, Dale; Kim, Allen; Sethi, Sahil; Sorgen, Paul L; Guda, Chittibabu.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041910

RESUMO

Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) generates genome-wide chromatin accessibility profiles, providing valuable insights into epigenetic gene regulation at both pooled-cell and single-cell population levels. Comprehensive analysis of ATAC-seq data involves the use of various interdependent programs. Learning the correct sequence of steps needed to process the data can represent a major hurdle. Selecting appropriate parameters at each stage, including pre-analysis, core analysis, and advanced downstream analysis, is important to ensure accurate analysis and interpretation of ATAC-seq data. Additionally, obtaining and working within a limited computational environment presents a significant challenge to non-bioinformatic researchers. Therefore, we present Cloud ATAC, an open-source, cloud-based interactive framework with a scalable, flexible, and streamlined analysis framework based on the best practices approach for pooled-cell and single-cell ATAC-seq data. These frameworks use on-demand computational power and memory, scalability, and a secure and compliant environment provided by the Google Cloud. Additionally, we leverage Jupyter Notebook's interactive computing platform that combines live code, tutorials, narrative text, flashcards, quizzes, and custom visualizations to enhance learning and analysis. Further, leveraging GPU instances has significantly improved the run-time of the single-cell framework. The source codes and data are publicly available through NIH Cloud lab https://github.com/NIGMS/ATAC-Seq-and-Single-Cell-ATAC-Seq-Analysis. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Computação em Nuvem , Sequenciamento de Nucleotídeos em Larga Escala , Software , Sequenciamento de Nucleotídeos em Larga Escala/métodos , Humanos , Biologia Computacional/métodos , Sequenciamento de Cromatina por Imunoprecipitação/métodos , Análise de Célula Única/métodos , Cromatina/genética , Cromatina/metabolismo

4.

Cloud-based introduction to BASH programming for biologists.

Wilkins, Owen M; Campbell, Ross; Yosufzai, Zelaikha; Doe, Valena; Soucy, Shannon M.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041911

RESUMO

This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning', https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial authored by National Institute of General Medical Sciences: NIGMS Sandbox: A Learning Platform toward Democratizing Cloud Computing for Biomedical Research at the beginning of this supplement. This module delivers learning materials introducing the utility of the BASH (Bourne Again Shell) programming language for genomic data analysis in an interactive format that uses appropriate cloud resources for data access and analyses. The next-generation sequencing revolution has generated massive amounts of novel biological data from a multitude of platforms that survey an ever-growing list of genomic modalities. These data require significant downstream computational and statistical analyses to glean meaningful biological insights. However, the skill sets required to generate these data are vastly different from the skills required to analyze these data. Bench scientists that generate next-generation data often lack the training required to perform analysis of these datasets and require support from bioinformatics specialists. Dedicated computational training is required to empower biologists in the area of genomic data analysis, however, learning to efficiently leverage a command line interface is a significant barrier in learning how to leverage common analytical tools. Cloud platforms have the potential to democratize access to the technical tools and computational resources necessary to work with modern sequencing data, providing an effective framework for bioinformatics education. This module aims to provide an interactive platform that slowly builds technical skills and knowledge needed to interact with genomics data on the command line in the Cloud. The sandbox format of this module enables users to move through the material at their own pace and test their grasp of the material with knowledge self-checks before building on that material in the next sub-module. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Computação em Nuvem , Biologia Computacional , Software , Biologia Computacional/métodos , Linguagens de Programação , Sequenciamento de Nucleotídeos em Larga Escala/métodos , Genômica/métodos , Humanos

5.

A cloud-based learning module for biomarker discovery.

Hemme, Christopher L; Beaudry, Laura; Yosufzai, Zelaikha; Kim, Allen; Pan, Daniel; Campbell, Ross; Price, Marcia; Cho, Bongsup P.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041912

RESUMO

This manuscript describes the development of a resource module that is part of a learning platform named "NIGMS Sandbox for Cloud-based Learning" https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on basic principles in biomarker discovery in an interactive format that uses appropriate cloud resources for data access and analyses. In collaboration with Google Cloud, Deloitte Consulting and NIGMS, the Rhode Island INBRE Molecular Informatics Core developed a cloud-based training module for biomarker discovery. The module consists of nine submodules covering various topics on biomarker discovery and assessment and is deployed on the Google Cloud Platform and available for public use through the NIGMS Sandbox. The submodules are written as a series of Jupyter Notebooks utilizing R and Bioconductor for biomarker and omics data analysis. The submodules cover the following topics: 1) introduction to biomarkers; 2) introduction to R data structures; 3) introduction to linear models; 4) introduction to exploratory analysis; 5) rat renal ischemia-reperfusion injury case study; (6) linear and logistic regression for comparison of quantitative biomarkers; 7) exploratory analysis of proteomics IRI data; 8) identification of IRI biomarkers from proteomic data; and 9) machine learning methods for biomarker discovery. Each notebook includes an in-line quiz for self-assessment on the submodule topic and an overview video is available on YouTube (https://www.youtube.com/watch?v=2-Q9Ax8EW84). This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Biomarcadores , Computação em Nuvem , Biomarcadores/metabolismo , Animais , Software , Humanos , Ratos , Aprendizado de Máquina , Biologia Computacional/métodos

6.

CCPA: cloud-based, self-learning modules for consensus pathway analysis using GO, KEGG and Reactome.

Nguyen, Ha; Pham, Van-Dung; Nguyen, Hung; Tran, Bang; Petereit, Juli; Nguyen, Tin.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041916

RESUMO

This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' (https://github.com/NIGMS/NIGMS-Sandbox). The module delivers learning materials on Cloud-based Consensus Pathway Analysis in an interactive format that uses appropriate cloud resources for data access and analyses. Pathway analysis is important because it allows us to gain insights into biological mechanisms underlying conditions. But the availability of many pathway analysis methods, the requirement of coding skills, and the focus of current tools on only a few species all make it very difficult for biomedical researchers to self-learn and perform pathway analysis efficiently. Furthermore, there is a lack of tools that allow researchers to compare analysis results obtained from different experiments and different analysis methods to find consensus results. To address these challenges, we have designed a cloud-based, self-learning module that provides consensus results among established, state-of-the-art pathway analysis techniques to provide students and researchers with necessary training and example materials. The training module consists of five Jupyter Notebooks that provide complete tutorials for the following tasks: (i) process expression data, (ii) perform differential analysis, visualize and compare the results obtained from four differential analysis methods (limma, t-test, edgeR, DESeq2), (iii) process three pathway databases (GO, KEGG and Reactome), (iv) perform pathway analysis using eight methods (ORA, CAMERA, KS test, Wilcoxon test, FGSEA, GSA, SAFE and PADOG) and (v) combine results of multiple analyses. We also provide examples, source code, explanations and instructional videos for trainees to complete each Jupyter Notebook. The module supports the analysis for many model (e.g. human, mouse, fruit fly, zebra fish) and non-model species. The module is publicly available at https://github.com/NIGMS/Consensus-Pathway-Analysis-in-the-Cloud. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Computação em Nuvem , Software , Humanos , Biologia Computacional/métodos , Biologia Computacional/educação , Animais , Ontologia Genética

7.

Whole-genome bisulfite sequencing data analysis learning module on Google Cloud Platform.

Qin, Yujia; Maggio, Angela; Hawkins, Dale; Beaudry, Laura; Kim, Allen; Pan, Daniel; Gong, Ting; Fu, Yuanyuan; Yang, Hua; Deng, Youping.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041913

RESUMO

This study describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module is designed to facilitate interactive learning of whole-genome bisulfite sequencing (WGBS) data analysis utilizing cloud-based tools in Google Cloud Platform, such as Cloud Storage, Vertex AI notebooks and Google Batch. WGBS is a powerful technique that can provide comprehensive insights into DNA methylation patterns at single cytosine resolution, essential for understanding epigenetic regulation across the genome. The designed learning module first provides step-by-step tutorials that guide learners through two main stages of WGBS data analysis, preprocessing and the identification of differentially methylated regions. And then, it provides a streamlined workflow and demonstrates how to effectively use it for large datasets given the power of cloud infrastructure. The integration of these interconnected submodules progressively deepens the user's understanding of the WGBS analysis process along with the use of cloud resources. Through this module, we can enhance the accessibility and adoption of cloud computing in epigenomic research, speeding up the advancements in the related field and beyond. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Computação em Nuvem , Metilação de DNA , Software , Sequenciamento Completo do Genoma , Sequenciamento Completo do Genoma/métodos , Sulfitos/química , Humanos , Epigênese Genética , Biologia Computacional/métodos

8.

Understanding proteome quantification in an interactive learning module on Google Cloud Platform.

O'Connell, Kyle A; Kopchick, Benjamin; Carlson, Thad; Belardo, David; Byrum, Stephanie D.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041914

RESUMO

This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on protein quantification in an interactive format that uses appropriate cloud resources for data access and analyses. Quantitative proteomics is a rapidly growing discipline due to the cutting-edge technologies of high resolution mass spectrometry. There are many data types to consider for proteome quantification including data dependent acquisition, data independent acquisition, multiplexing with Tandem Mass Tag reporter ions, spectral counts, and more. As part of the NIH NIGMS Sandbox effort, we developed a learning module to introduce students to mass spectrometry terminology, normalization methods, statistical designs, and basics of R programming. By utilizing the Google Cloud environment, the learning module is easily accessible without the need for complex installation procedures. The proteome quantification module demonstrates the analysis using a provided TMT10plex data set using MS3 reporter ion intensity quantitative values in a Jupyter notebook with an R kernel. The learning module begins with the raw intensities, performs normalization, and differential abundance analysis using limma models, and is designed for researchers with a basic understanding of mass spectrometry and R programming language. Learners walk away with a better understanding of how to navigate Google Cloud Platform for proteomic research, and with the basics of mass spectrometry data analysis at the command line. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Computação em Nuvem , Proteoma , Proteômica , Software , Proteoma/metabolismo , Proteômica/métodos , Espectrometria de Massas , Humanos

9.

Identifying and training deep learning neural networks on biomedical-related datasets.

Woessner, Alan E; Anjum, Usman; Salman, Hadi; Lear, Jacob; Turner, Jeffrey T; Campbell, Ross; Beaudry, Laura; Zhan, Justin; Cornett, Lawrence E; Gauch, Susan; Quinn, Kyle P.

Brief Bioinform ; 25(Supplement_1)2024 Jul 23.

Artigo em Inglês | MEDLINE | ID: mdl-39041915

RESUMO

This manuscript describes the development of a resources module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on implementing deep learning algorithms for biomedical image data in an interactive format that uses appropriate cloud resources for data access and analyses. Biomedical-related datasets are widely used in both research and clinical settings, but the ability for professionally trained clinicians and researchers to interpret datasets becomes difficult as the size and breadth of these datasets increases. Artificial intelligence, and specifically deep learning neural networks, have recently become an important tool in novel biomedical research. However, use is limited due to their computational requirements and confusion regarding different neural network architectures. The goal of this learning module is to introduce types of deep learning neural networks and cover practices that are commonly used in biomedical research. This module is subdivided into four submodules that cover classification, augmentation, segmentation and regression. Each complementary submodule was written on the Google Cloud Platform and contains detailed code and explanations, as well as quizzes and challenges to facilitate user training. Overall, the goal of this learning module is to enable users to identify and integrate the correct type of neural network with their data while highlighting the ease-of-use of cloud computing for implementing neural networks. This manuscript describes the development of a resource module that is part of a learning platform named ``NIGMS Sandbox for Cloud-based Learning'' https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.

Assuntos

Aprendizado Profundo , Redes Neurais de Computação , Humanos , Pesquisa Biomédica , Algoritmos , Computação em Nuvem

10.

A data science roadmap for open science organizations engaged in early-stage drug discovery.

Edfeldt, Kristina; Edwards, Aled M; Engkvist, Ola; Günther, Judith; Hartley, Matthew; Hulcoop, David G; Leach, Andrew R; Marsden, Brian D; Menge, Amelie; Misquitta, Leonie; Müller, Susanne; Owen, Dafydd R; Schütt, Kristof T; Skelton, Nicholas; Steffen, Andreas; Tropsha, Alexander; Vernet, Erik; Wang, Yanli; Wellnitz, James; Willson, Timothy M; Clevert, Djork-Arné; Haibe-Kains, Benjamin; Schiavone, Lovisa Holmberg; Schapira, Matthieu.

Nat Commun ; 15(1): 5640, 2024 Jul 05.

Artigo em Inglês | MEDLINE | ID: mdl-38965235

RESUMO

The Structural Genomics Consortium is an international open science research organization with a focus on accelerating early-stage drug discovery, namely hit discovery and optimization. We, as many others, believe that artificial intelligence (AI) is poised to be a main accelerator in the field. The question is then how to best benefit from recent advances in AI and how to generate, format and disseminate data to enable future breakthroughs in AI-guided drug discovery. We present here the recommendations of a working group composed of experts from both the public and private sectors. Robust data management requires precise ontologies and standardized vocabulary while a centralized database architecture across laboratories facilitates data integration into high-value datasets. Lab automation and opening electronic lab notebooks to data mining push the boundaries of data sharing and data modeling. Important considerations for building robust machine-learning models include transparent and reproducible data processing, choosing the most relevant data representation, defining the right training and test sets, and estimating prediction uncertainty. Beyond data-sharing, cloud-based computing can be harnessed to build and disseminate machine-learning models. Important vectors of acceleration for hit and chemical probe discovery will be (1) the real-time integration of experimental data generation and modeling workflows within design-make-test-analyze (DMTA) cycles openly, and at scale and (2) the adoption of a mindset where data scientists and experimentalists work as a unified team, and where data science is incorporated into the experimental design.

Assuntos

Ciência de Dados , Descoberta de Drogas , Aprendizado de Máquina , Descoberta de Drogas/métodos , Ciência de Dados/métodos , Humanos , Inteligência Artificial , Disseminação de Informação/métodos , Mineração de Dados/métodos , Computação em Nuvem , Bases de Dados Factuais

11.

Reusable tutorials for using cloud-based computing environments for the analysis of bacterial gene expression data from bulk RNA sequencing.

Allers, Steven; O'Connell, Kyle A; Carlson, Thad; Belardo, David; King, Benjamin L.

Brief Bioinform ; 25(4)2024 May 23.

Artigo em Inglês | MEDLINE | ID: mdl-38997128

RESUMO

This manuscript describes the development of a resource module that is part of a learning platform named "NIGMS Sandbox for Cloud-based Learning" https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on RNA sequencing (RNAseq) data analysis in an interactive format that uses appropriate cloud resources for data access and analyses. Biomedical research is increasingly data-driven, and dependent upon data management and analysis methods that facilitate rigorous, robust, and reproducible research. Cloud-based computing resources provide opportunities to broaden the application of bioinformatics and data science in research. Two obstacles for researchers, particularly those at small institutions, are: (i) access to bioinformatics analysis environments tailored to their research; and (ii) training in how to use Cloud-based computing resources. We developed five reusable tutorials for bulk RNAseq data analysis to address these obstacles. Using Jupyter notebooks run on the Google Cloud Platform, the tutorials guide the user through a workflow featuring an RNAseq dataset from a study of prophage altered drug resistance in Mycobacterium chelonae. The first tutorial uses a subset of the data so users can learn analysis steps rapidly, and the second uses the entire dataset. Next, a tutorial demonstrates how to analyze the read count data to generate lists of differentially expressed genes using R/DESeq2. Additional tutorials generate read counts using the Snakemake workflow manager and Nextflow with Google Batch. All tutorials are open-source and can be used as templates for other analysis.

Assuntos

Computação em Nuvem , Biologia Computacional , Análise de Sequência de RNA , Software , Biologia Computacional/métodos , Análise de Sequência de RNA/métodos , Regulação Bacteriana da Expressão Gênica

12.

Variability in wet and dry snow radar zones in the North of the Antarctic Peninsula using a cloud computing environment.

Idalino, Filipe D; Rosa, Kátia K DA; Hillebrand, Fernando L; Arigony-Neto, Jorge; Mendes, Claudio Wilson; Simões, Jefferson C.

An Acad Bras Cienc ; 96(suppl 2): e20230704, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-39016361

RESUMO

This work investigated the annual variations in dry snow (DSRZ) and wet snow radar zones (WSRZ) in the north of the Antarctic Peninsula between 2015-2023. A specific code for snow zone detection on Sentinel-1 images was created on Google Earth Engine by combining the CryoSat-2 digital elevation model and air temperature data from ERA5. Regions with backscatter coefficients (σ°) values exceeding -6.5 dB were considered the extent of surface melt occurrence, and the dry snow line was considered to coincide with the -11 °C isotherm of the average annual air temperature. The annual variation in WSRZ exhibited moderate correlations with annual average air temperature, total precipitation, and the sum of annual degree-days. However, statistical tests indicated low determination coefficients and no significant trend values in DSRZ behavior with atmospheric variables. The results of reducing DSRZ area for 2019/2020 and 2020/2021 compared to 2018/2018 indicated the upward in dry zone line in this AP region. The methodology demonstrated its efficacy for both quantitative and qualitative analyses of data obtained in digital processing environments, allowing for the large-scale spatial and temporal variations monitoring and for the understanding changes in glacier mass loss.

Assuntos

Computação em Nuvem , Radar , Neve , Regiões Antárticas , Estações do Ano , Monitoramento Ambiental/métodos , Temperatura

13.

The Flux Operator.

Sochat, Vanessa; Culquicondor, Aldo; Ojea, Antonio; Milroy, Daniel.

F1000Res ; 13: 203, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38868668

RESUMO

Converged computing is an emerging area of computing that brings together the best of both worlds for high performance computing (HPC) and cloud-native communities. The economic influence of cloud computing and the need for workflow portability, flexibility, and manageability are driving this emergence. Navigating the uncharted territory and building an effective space for both HPC and cloud require collaborative technological development and research. In this work, we focus on developing components for the converged workload manager, the central component of batch workflows running in any environment. From the cloud we base our work on Kubernetes, the de facto standard batch workload orchestrator. From HPC the orchestrator counterpart is Flux Framework, a fully hierarchical resource management and graph-based scheduler with a modular architecture that supports sophisticated scheduling and job management. Bringing these managers together consists of implementing Flux inside of Kubernetes, enabling hierarchical resource management and scheduling that scales without burdening the Kubernetes scheduler. This paper introduces the Flux Operator - an on-demand HPC workload manager deployed in Kubernetes. Our work describes design decisions, mapping components between environments, and experimental features. We perform experiments that compare application performance when deployed by the Flux Operator and the MPI Operator and present the results. Finally, we review remaining challenges and describe our vision of the future for improved technological innovation and collaboration through converged computing.

Assuntos

Computação em Nuvem , Carga de Trabalho , Fluxo de Trabalho

14.

Effectiveness of remote risk-based monitoring and potential benefits for combination with direct data capture.

Yamada, Osamu; Chiu, Shih-Wei; Nakazawa, Toru; Tsuda, Satoru; Yoshida, Mitsuhide; Asano, Toshifumi; Kokubun, Taiki; Hashimoto, Kazuki; Takata, Munenori; Ikeda, Suzuka; Kawabe, Yosuke; Tamura, Yuko; Yamaguchi, Takuhiro.

Trials ; 25(1): 384, 2024 Jun 14.

Artigo em Inglês | MEDLINE | ID: mdl-38877566

RESUMO

BACKGROUND: In recent years, alternative monitoring approaches, such as risk-based and remote monitoring techniques, have been recommended instead of traditional on-site monitoring to achieve more efficient monitoring. Remote risk-based monitoring (R2BM) is a monitoring technique that combines risk-based and remote monitoring and focuses on the detection of critical data and process errors. Direct data capture (DDC), which directly collects electronic source data, can facilitate R2BM by minimizing the extent of source documents that must be reviewed and reducing the additional workload on R2BM. In this study, we evaluated the effectiveness of R2BM and the synergistic effect of combining R2BM with DDC. METHODS: R2BM was prospectively conducted with eight participants in a randomized clinical trial using a remote monitoring system that uploaded photographs of source documents to a cloud location. Critical data and processes were verified by R2BM, and later, all were confirmed by on-site monitoring to evaluate the ability of R2BM to detect critical data and process errors and the workload of uploading photographs for clinical trial staff. In addition, the reduction of the number of uploaded photographs was evaluated by assuming that the DDC was introduced for data collection. RESULTS: Of the 4645 data points, 20.9% (n = 973, 95% confidence interval = 19.8-22.2) were identified as critical. All critical data errors corresponding to 5.4% (n = 53/973, 95% confidence interval = 4.1-7.1) of the critical data and critical process errors were detectable by R2BM. The mean number of uploaded photographs and the mean time to upload them per visit per participant were 34.4 ± 11.9 and 26.5 ± 11.8 min (mean ± standard deviation), respectively. When assuming that DDC was introduced for data collection, 45.0% (95% confidence interval = 42.2-47.9) of uploaded photographs for R2BM were reduced. CONCLUSIONS: R2BM can detect 100% of the critical data and process errors without on-site monitoring. Combining R2BM with DDC reduces the workload of R2BM and further improves its efficiency.

Assuntos

Fotografação , Humanos , Estudos Prospectivos , Medição de Risco , Carga de Trabalho , Computação em Nuvem , Coleta de Dados/métodos , Feminino , Masculino , Confiabilidade dos Dados , Projetos de Pesquisa

15.

Evaluation of comprehensive early warning for higher education institutions' cloud model of simulated enterprise management cockpit.

Fan, Weixing; Li, Zhifeng.

PLoS One ; 19(6): e0305652, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38889112

RESUMO

The cross-disciplinary virtually simulated platform for enterprise management in universities' new business courses is an initiative practical framework based on scenario-driven tasks. However, there is a prominent conflict between the rapid operating cycle of simulation enterprises plus their fierce competitions and the strategic demand for real-time analysis for operational data. Based on such demand, this study takes the development method of the simulated enterprise management cockpit from Guangzhou Huashang College as an example. It adopts the combined weighting method based on cloud models to determine indicator weights, then qualitative and quantitative data analyses are conducted from five aspects: "business, finance and operation", "customer management and marketing", "internal operational objectives", "product development strategy", along with "team building and management". This approach achieves a comprehensive evaluation and early warning of the enterprise management process. Specifically, the subjective weights are determined by the Analytic Hierarchy Process, while the objective weights by the entropy weight method, finally verified by cloud model evaluation of its overall indicator performance. The design can evaluate the comprehensive performance of enterprise management indicators and students' activity participation through the cloud-based application and the digital cockpit, so as to fully presents the enterprise's overall management level, along with judgement of whether it is reasonable through pointers in different colors. In addition, apparent indicator-related characteristics are also utilized to assess future decision-making directions. Finally, this comprehensive approach can timely optimize operation strategies and facilitate budget allocation for future development.

Assuntos

Computação em Nuvem , Universidades , Humanos , Simulação por Computador , Modelos Teóricos , Comércio

16.

Security Analysis for Smart Healthcare Systems.

Ibrahim, Mariam; Al-Wadi, Abdallah; Elhafiz, Ruba.

Sensors (Basel) ; 24(11)2024 May 24.

Artigo em Inglês | MEDLINE | ID: mdl-38894166

RESUMO

The healthcare industry went through reformation by integrating the Internet of Medical Things (IoMT) to enable data harnessing by transmission mediums from different devices, about patients to healthcare staff devices, for further analysis through cloud-based servers for proper diagnosis of patients, yielding efficient and accurate results. However, IoMT technology is accompanied by a set of drawbacks in terms of security risks and vulnerabilities, such as violating and exposing patients' sensitive and confidential data. Further, the network traffic data is prone to interception attacks caused by a wireless type of communication and alteration of data, which could cause unwanted outcomes. The advocated scheme provides insight into a robust Intrusion Detection System (IDS) for IoMT networks. It leverages a honeypot to divert attackers away from critical systems, reducing the attack surface. Additionally, the IDS employs an ensemble method combining Logistic Regression and K-Nearest Neighbor algorithms. This approach harnesses the strengths of both algorithms to improve attack detection accuracy and robustness. This work analyzes the impact, performance, accuracy, and precision outcomes of the used model on two IoMT-related datasets which contain multiple attack types such as Man-In-The-Middle (MITM), Data Injection, and Distributed Denial of Services (DDOS). The yielded results showed that the proposed ensemble method was effective in detecting intrusion attempts and classifying them as attacks or normal network traffic, with a high accuracy of 92.5% for the first dataset and 99.54% for the second dataset and a precision of 96.74% for the first dataset and 99.228% for the second dataset.

Assuntos

Algoritmos , Segurança Computacional , Atenção à Saúde , Internet das Coisas , Humanos , Tecnologia sem Fio , Computação em Nuvem , Confidencialidade

17.

A cloud-based training module for efficient de novo transcriptome assembly using Nextflow and Google cloud.

Seaman, Ryan P; Campbell, Ross; Doe, Valena; Yosufzai, Zelaikha; Graber, Joel H.

Brief Bioinform ; 25(4)2024 May 23.

Artigo em Inglês | MEDLINE | ID: mdl-38941113

RESUMO

This study describes the development of a resource module that is part of a learning platform named "NIGMS Sandbox for Cloud-based Learning" (https://github.com/NIGMS/NIGMS-Sandbox). The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox at the beginning of this Supplement. This module delivers learning materials on de novo transcriptome assembly using Nextflow in an interactive format that uses appropriate cloud resources for data access and analysis. Cloud computing is a powerful new means by which biomedical researchers can access resources and capacity that were previously either unattainable or prohibitively expensive. To take advantage of these resources, however, the biomedical research community needs new skills and knowledge. We present here a cloud-based training module, developed in conjunction with Google Cloud, Deloitte Consulting, and the NIH STRIDES Program, that uses the biological problem of de novo transcriptome assembly to demonstrate and teach the concepts of computational workflows (using Nextflow) and cost- and resource-efficient use of Cloud services (using Google Cloud Platform). Our work highlights the reduced necessity of on-site computing resources and the accessibility of cloud-based infrastructure for bioinformatics applications.

Assuntos

Computação em Nuvem , Transcriptoma , Biologia Computacional/métodos , Biologia Computacional/educação , Software , Humanos , Perfilação da Expressão Gênica/métodos , Internet

18.

MS-PyCloud: A Cloud Computing-Based Pipeline for Proteomic and Glycoproteomic Data Analyses.

Hu, Yingwei; Schnaubelt, Michael; Chen, Li; Zhang, Bai; Hoang, Trung; Lih, T Mamie; Zhang, Zhen; Zhang, Hui.

Anal Chem ; 96(25): 10145-10151, 2024 Jun 25.

Artigo em Inglês | MEDLINE | ID: mdl-38869158

RESUMO

Rapid development and wide adoption of mass spectrometry-based glycoproteomic technologies have empowered scientists to study proteins and protein glycosylation in complex samples on a large scale. This progress has also created unprecedented challenges for individual laboratories to store, manage, and analyze proteomic and glycoproteomic data, both in the cost for proprietary software and high-performance computing and in the long processing time that discourages on-the-fly changes of data processing settings required in explorative and discovery analysis. We developed an open-source, cloud computing-based pipeline, MS-PyCloud, with graphical user interface (GUI), for proteomic and glycoproteomic data analysis. The major components of this pipeline include data file integrity validation, MS/MS database search for spectral assignments to peptide sequences, false discovery rate estimation, protein inference, quantitation of global protein levels, and specific glycan-modified glycopeptides as well as other modification-specific peptides such as phosphorylation, acetylation, and ubiquitination. To ensure the transparency and reproducibility of data analysis, MS-PyCloud includes open-source software tools with comprehensive testing and versioning for spectrum assignments. Leveraging public cloud computing infrastructure via Amazon Web Services (AWS), MS-PyCloud scales seamlessly based on analysis demand to achieve fast and efficient performance. Application of the pipeline to the analysis of large-scale LC-MS/MS data sets demonstrated the effectiveness and high performance of MS-PyCloud. The software can be downloaded at https://github.com/huizhanglab-jhu/ms-pycloud.

Assuntos

Proteômica , Proteômica/métodos , Software , Espectrometria de Massas em Tandem/métodos , Computação em Nuvem , Glicoproteínas/análise , Humanos

19.

E2SVM: Electricity-Efficient SLA-aware Virtual Machine Consolidation approach in cloud data centers.

Kumar, Vaneet; Ali, Aleem; Mittal, Payal; Aqeel, Ibrahim; Shuaib, Mohammed; Alam, Shadab; Aalsalem, Mohammed Y.

PLoS One ; 19(6): e0303313, 2024.

Artigo em Inglês | MEDLINE | ID: mdl-38857300

RESUMO

Cloud data centers present a challenge to environmental sustainability because of their significant energy consumption. Additionally, performance degradation resulting from energy management solutions, such as virtual machine (VM) consolidation, impacts service level agreements (SLAs) between cloud service providers and users. Thus, to achieve a balance between efficient energy consumption and avoiding SLA violations, we propose a novel VM consolidation algorithm. Conventional algorithms result in unnecessary migrations when improperly selecting VMs. Therefore, our proposed E2SVM algorithm addresses this issue by selecting VMs with high load fluctuations and minimal resource usage from overloaded servers. These selected VMs are then placed on normally loaded servers, considering their stability index. Moreover, our approach prevents server underutilization by either applying all or no VM migrations. Simulation results demonstrate a 12.9% decrease in maximum energy consumption compared with the minimum migration time VM selection policy. In addition, a 47% reduction in SLA violations was observed when using the medium absolute deviation as the overload detection policy. Therefore, this approach holds promise for real-world data centers because it minimizes energy waste and maintains low SLA violations.

Assuntos

Algoritmos , Computação em Nuvem , Eletricidade

20.

Lightning Pose: improved animal pose estimation via semi-supervised learning, Bayesian ensembling and cloud-native open-source tools.

Biderman, Dan; Whiteway, Matthew R; Hurwitz, Cole; Greenspan, Nicholas; Lee, Robert S; Vishnubhotla, Ankit; Warren, Richard; Pedraja, Federico; Noone, Dillon; Schartner, Michael M; Huntenburg, Julia M; Khanal, Anup; Meijer, Guido T; Noel, Jean-Paul; Pan-Vazquez, Alejandro; Socha, Karolina Z; Urai, Anne E; Cunningham, John P; Sawtell, Nathaniel B; Paninski, Liam.

Nat Methods ; 21(7): 1316-1328, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38918605

RESUMO

Contemporary pose estimation methods enable precise measurements of behavior via supervised deep learning with hand-labeled video frames. Although effective in many cases, the supervised approach requires extensive labeling and often produces outputs that are unreliable for downstream analyses. Here, we introduce 'Lightning Pose', an efficient pose estimation package with three algorithmic contributions. First, in addition to training on a few labeled video frames, we use many unlabeled videos and penalize the network whenever its predictions violate motion continuity, multiple-view geometry and posture plausibility (semi-supervised learning). Second, we introduce a network architecture that resolves occlusions by predicting pose on any given frame using surrounding unlabeled frames. Third, we refine the pose predictions post hoc by combining ensembling and Kalman smoothing. Together, these components render pose trajectories more accurate and scientifically usable. We released a cloud application that allows users to label data, train networks and process new videos directly from the browser.

Assuntos

Algoritmos , Teorema de Bayes , Gravação em Vídeo , Animais , Gravação em Vídeo/métodos , Aprendizado de Máquina Supervisionado , Computação em Nuvem , Software , Postura/fisiologia , Aprendizado Profundo , Processamento de Imagem Assistida por Computador/métodos , Comportamento Animal

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA