Search | BVS Integralidade em Saúde

The Carbon Footprint of Bioinformatics.

Grealey, Jason; Lannelongue, Loïc; Saw, Woei-Yuh; Marten, Jonathan; Méric, Guillaume; Ruiz-Carmona, Sergio; Inouye, Michael.

Mol Biol Evol ; 39(3), 2022 03 02.

Article in English | MEDLINE | ID: mdl-35143670

ABSTRACT

Bioinformatic research relies on large-scale computational infrastructures, which have a nonzero carbon footprint, but so far no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this work, we estimate the carbon footprint of bioinformatics (in kilograms of CO2 equivalent units, kgCO2e) using the freely available Green Algorithms calculator (www.green-algorithms.org, last accessed 2022). We assessed 1) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics, and molecular simulations, as well as 2) computation strategies, such as parallelization, CPU (central processing unit) versus GPU (graphics processing unit), cloud versus local computing infrastructure, and geography. In particular, we found that biobank-scale GWAS emitted substantial kgCO2e and that simple software upgrades could make it greener; for example, upgrading from BOLT-LMM v1 to v2.3 reduced the carbon footprint by 73%. Moreover, switching from the average data center to a more efficient one can reduce the carbon footprint by approximately 34%. Memory over-allocation can also be a substantial contributor to an algorithm's greenhouse gas emissions. The use of faster processors or greater parallelization reduces running time but can lead to a greater carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimize kgCO2e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions which empower a move toward greener research.
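The observation that greater parallelization shortens running time yet can raise the footprint can be illustrated with a back-of-the-envelope sketch. The model here is an assumption and is not taken from the paper: speed-up follows Amdahl's law with a hypothetical parallel fraction, every allocated core draws a hypothetical fixed power for the whole run, and memory and cooling overheads are ignored.

```python
def runtime_hours(serial_hours: float, cores: int, parallel_fraction: float) -> float:
    """Amdahl's law: only the parallel fraction of the work speeds up with more cores."""
    return serial_hours * ((1.0 - parallel_fraction) + parallel_fraction / cores)


def energy_kwh(serial_hours: float, cores: int, parallel_fraction: float,
               watts_per_core: float) -> float:
    """Energy drawn by the allocated cores alone, in kilowatt-hours."""
    hours = runtime_hours(serial_hours, cores, parallel_fraction)
    return cores * watts_per_core * hours / 1000.0


if __name__ == "__main__":
    # Hypothetical job: 10 h on one core, 90% parallelizable, 12 W per core.
    for cores in (1, 4, 16):
        t = runtime_hours(10.0, cores, 0.9)
        e = energy_kwh(10.0, cores, 0.9, 12.0)
        print(f"{cores:>2} cores: {t:5.2f} h, {e:.3f} kWh")
```

Under these assumed numbers the run gets shorter with every added core, while the energy drawn rises because the serial part of the job keeps all allocated cores powered without using them.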

Subjects

Carbon Footprint , Computational Biology , Algorithms , Genome-Wide Association Study , Software

Ten simple rules to make your computing more environmentally sustainable.

Lannelongue, Loïc; Grealey, Jason; Bateman, Alex; Inouye, Michael.

PLoS Comput Biol ; 17(9): e1009324, 2021 09.

Article in English | MEDLINE | ID: mdl-34543272

Subjects

Computers , Conservation of Natural Resources , Guidelines as Topic , Information Technology , Carbon Dioxide/analysis , Climate Change , Electronic Waste , Humans , Recycling

Machine learning optimized polygenic scores for blood cell traits identify sex-specific trajectories and genetic correlations with disease.

Xu, Yu; Vuckovic, Dragana; Ritchie, Scott C; Akbari, Parsa; Jiang, Tao; Grealey, Jason; Butterworth, Adam S; Ouwehand, Willem H; Roberts, David J; Di Angelantonio, Emanuele; Danesh, John; Soranzo, Nicole; Inouye, Michael.

Cell Genom ; 2(1), 2022 Jan 12.

Article in English | MEDLINE | ID: mdl-35072137

ABSTRACT

Genetic association studies for blood cell traits, which are key indicators of health and immune function, have identified several hundred associations and defined a complex polygenic architecture. Polygenic scores (PGSs) for blood cell traits have potential clinical utility in disease risk prediction and prevention, but designing PGS remains challenging and the optimal methods are unclear. To address this, we evaluated the relative performance of 6 methods to develop PGS for 26 blood cell traits, including a standard method of pruning and thresholding (P + T) and 5 learning methods: LDpred2, elastic net (EN), Bayesian ridge (BR), multilayer perceptron (MLP) and convolutional neural network (CNN). We evaluated these optimized PGSs on blood cell trait data from UK Biobank and INTERVAL. We find that PGSs designed using common machine learning methods EN and BR show improved prediction of blood cell traits and consistently outperform other methods. Our analyses suggest EN/BR as the top choices for PGS construction, showing improved performance for 25 blood cell traits in the external validation, with correlations with the directly measured traits increasing by 10%-23%. Ten PGSs showed significant statistical interaction with sex, and sex-specific PGS stratification showed that all of them had substantial variation in the trajectories of blood cell traits with age. Genetic correlations between the PGSs for blood cell traits and common human diseases identified well-known as well as new associations. We develop machine learning-optimized PGS for blood cell traits, demonstrate their relationships with sex, age, and disease, and make these publicly available as a resource.
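Whichever method produces the weights, a polygenic score is typically applied as a weighted sum of allele dosages, and the evaluation described above compares the score with the directly measured trait by correlation. The sketch below illustrates only that final step. The variant identifiers, weights, and function names are hypothetical, and nothing here reproduces the authors' pipeline.

```python
from math import sqrt


def polygenic_score(dosages: dict, weights: dict) -> float:
    """Weighted sum of allele dosages (0, 1 or 2) over variants that have a weight."""
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical per-variant weights, e.g. as fitted by a penalized regression.
    weights = {"var1": 0.20, "var2": -0.10, "var3": 0.05}
    # Hypothetical genotypes (allele dosages) and measured trait values.
    genotypes = [
        {"var1": 2, "var2": 1, "var3": 0},
        {"var1": 0, "var2": 2, "var3": 1},
        {"var1": 1, "var2": 0, "var3": 2},
        {"var1": 2, "var2": 0, "var3": 1},
    ]
    trait = [5.1, 4.2, 4.9, 5.4]
    scores = [polygenic_score(g, weights) for g in genotypes]
    print("scores:", scores)
    print("correlation with trait:", pearson(scores, trait))
```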

Green Algorithms: Quantifying the Carbon Footprint of Computation.

Lannelongue, Loïc; Grealey, Jason; Inouye, Michael.

Adv Sci (Weinh) ; 8(12): 2100707, 2021 06.

Article in English | MEDLINE | ID: mdl-34194954

ABSTRACT

Climate change is profoundly affecting nearly all aspects of life on earth, including human societies, economies, and health. Various human activities are responsible for significant greenhouse gas (GHG) emissions, including data centers and other sources of large-scale computation. Although many important scientific milestones are achieved thanks to the development of high-performance computing, the resultant environmental impact is underappreciated. In this work, a methodological framework to estimate the carbon footprint of any computational task in a standardized and reliable way is presented and metrics to contextualize GHG emissions are defined. A freely available online tool, Green Algorithms (www.green-algorithms.org), is developed, which enables a user to estimate and report the carbon footprint of their computation. The tool easily integrates with computational processes as it requires minimal information and does not interfere with existing code, while also accounting for a broad range of hardware configurations. Finally, the GHG emissions of algorithms used for particle physics simulations, weather forecasts, and natural language processing are quantified. Taken together, this study develops a simple generalizable framework and freely available tool to quantify the carbon footprint of nearly any computation. Combined with recommendations to minimize unnecessary CO2 emissions, the authors hope to raise awareness and facilitate greener computation.
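The abstract does not spell out the estimation formula, so the following is only a sketch of the general shape such an estimate can take. It assumes that energy equals running time multiplied by the power drawn by cores and memory, scaled by the data center's power usage effectiveness (PUE), and that the footprint equals energy multiplied by the carbon intensity of the local electricity. All field names and numeric values are illustrative assumptions and are not the calculator's own parameters.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Job:
    runtime_hours: float
    n_cores: int
    watts_per_core: float      # assumed power draw of one core at full load
    core_usage: float          # fraction of core capacity used, 0 to 1
    memory_gb: float           # memory allocated, not memory actually used
    watts_per_gb: float        # assumed power draw per GB of allocated memory
    pue: float                 # data center overhead factor, 1.0 or higher
    carbon_intensity: float    # gCO2e emitted per kWh of electricity


def energy_kwh(job: Job) -> float:
    """Energy needed by the job, including data center overhead, in kWh."""
    power_watts = (job.n_cores * job.watts_per_core * job.core_usage
                   + job.memory_gb * job.watts_per_gb)
    return job.runtime_hours * power_watts * job.pue / 1000.0


def footprint_kgco2e(job: Job) -> float:
    """Carbon footprint in kilograms of CO2 equivalent."""
    return energy_kwh(job) * job.carbon_intensity / 1000.0


if __name__ == "__main__":
    # Hypothetical 12 h job on 8 cores with 64 GB of memory.
    base = Job(12.0, 8, 10.0, 1.0, 64.0, 0.4, 1.5, 400.0)
    over_allocated = replace(base, memory_gb=256.0)   # same job, 4x the memory
    efficient_site = replace(base, pue=1.1)           # same job, lower overhead
    for label, job in [("base", base), ("over-allocated memory", over_allocated),
                       ("more efficient data center", efficient_site)]:
        print(f"{label}: {footprint_kgco2e(job):.3f} kgCO2e")
```

Because the inputs are limited to running time, hardware allocation, and location-dependent factors, an estimate of this kind can be made without instrumenting or modifying the code being measured.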
