Search | VHL Search Portal

Rcupcake: an R package for querying and analyzing biomedical data through the BD2K PIC-SURE RESTful API.

Gutiérrez-Sacristán, Alba; Guedj, Romain; Korodi, Gabor; Stedman, Jason; Furlong, Laura I; Patel, Chirag J; Kohane, Isaac S; Avillach, Paul.

Bioinformatics ; 34(8): 1431-1432, 2018 04 15.

Article in English | MEDLINE | ID: mdl-29267850

ABSTRACT

Motivation: In the era of big data and precision medicine, the number of databases containing clinical, environmental, self-reported and biochemical variables is increasing exponentially. Enabling experts to focus on their research questions rather than on computational data management, access and analysis is one of the most significant current challenges. Results: We present Rcupcake, an R package that contains a variety of functions for leveraging different databases through the BD2K PIC-SURE RESTful API and facilitating their query, analysis and interpretation. The package offers a variety of analysis and visualization tools, including the study of phenotype co-occurrence and prevalence according to multiple layers of data, such as phenome, exposome or genome. Availability and implementation: The package is implemented in R and is available under the Mozilla v2 license from GitHub (https://github.com/hms-dbmi/Rcupcake). Two reproducible case studies are also available (https://github.com/hms-dbmi/Rcupcake-case-studies/blob/master/SSCcaseStudy_v01.ipynb, https://github.com/hms-dbmi/Rcupcake-case-studies/blob/master/NHANEScaseStudy_v01.ipynb). Contact: paul_avillach@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology/methods , Genome, Human , Phenotype , Precision Medicine , Software , Databases, Factual , Humans

Scalability and cost-effectiveness analysis of whole genome-wide association studies on Google Cloud Platform and Amazon Web Services.

Krissaane, Inès; De Niz, Carlos; Gutiérrez-Sacristán, Alba; Korodi, Gabor; Ede, Nneka; Kumar, Ranjay; Lyons, Jessica; Manrai, Arjun; Patel, Chirag; Kohane, Isaac; Avillach, Paul.

J Am Med Inform Assoc ; 27(9): 1425-1430, 2020 09 01.

Article in English | MEDLINE | ID: mdl-32719837

ABSTRACT

OBJECTIVE: Advancements in human genomics have generated a surge of available data, fueling the growth and accessibility of databases for more comprehensive, in-depth genetic studies. METHODS: We provide a straightforward and innovative methodology to optimize cloud configuration in order to conduct genome-wide association studies. We utilized Spark clusters on both Google Cloud Platform and Amazon Web Services, as well as Hail (http://doi.org/10.5281/zenodo.2646680), for analysis and exploration of genomic variant datasets. RESULTS: Comparative evaluation of numerous cloud-based cluster configurations demonstrates a successful and unprecedented compromise between speed and cost for performing genome-wide association studies on 4 distinct whole-genome sequencing datasets. Results are consistent across the 2 cloud providers and could be highly useful for accelerating research in genetics. CONCLUSIONS: We present a timely piece for one of the most frequently asked questions when moving to the cloud: what is the trade-off between speed and cost?

Subject(s)

Cloud Computing , Genome-Wide Association Study , Cloud Computing/economics , Computer Communication Networks , Cost-Benefit Analysis , Genome-Wide Association Study/economics , Genome-Wide Association Study/methods , Genomics/methods , Humans
