Search | BVS Regional Portal

1.

Packaging and containerization of computational methods.

Alser, Mohammed; Lawlor, Brendan; Abdill, Richard J; Waymost, Sharon; Ayyala, Ram; Rajkumar, Neha; LaPierre, Nathan; Brito, Jaqueline; Ribeiro-Dos-Santos, André M; Almadhoun, Nour; Sarwal, Varuni; Firtina, Can; Osinski, Tomasz; Eskin, Eleazar; Hu, Qiyang; Strong, Derek; Kim, Byoung-Do B D; Abedalthagafi, Malak S; Mutlu, Onur; Mangul, Serghei.

Nat Protoc ; 2024 Apr 02.

Article in English | MEDLINE | ID: mdl-38565959

ABSTRACT

Methods for analyzing the full complement of a biomolecule type, e.g., proteomics or metabolomics, generate large amounts of complex data. The software tools used to analyze omics data have reshaped the landscape of modern biology and become an essential component of biomedical research. These tools are themselves quite complex and often require the installation of other supporting software, libraries and/or databases. A researcher may also be using multiple different tools that require different versions of the same supporting materials. The increasing dependence of biomedical scientists on these powerful tools creates a need for easier installation and greater usability. Packaging and containerization are different approaches to satisfy this need by delivering omics tools already wrapped in additional software that makes the tools easier to install and use. In this systematic review, we describe and compare the features of prominent packaging and containerization platforms. We outline the challenges, advantages and limitations of each approach and some of the most widely used platforms from the perspectives of users, software developers and system administrators. We also propose principles to make the distribution of omics software more sustainable and robust to increase the reproducibility of biomedical and life science research.

2.

The roles of code in biology.

Lawlor, Brendan; Sleator, Roy D.

Sci Prog ; 104(2): 368504211010570, 2021.

Article in English | MEDLINE | ID: mdl-33856939

ABSTRACT

The way in which computer code is perceived and used in biological research has been a source of some controversy and confusion, and has resulted in sub-optimal outcomes related to reproducibility, scalability and productivity. We suggest that the confusion is due in part to a misunderstanding of the function of code when applied to the life sciences. Code has many roles, and in this paper we present a three-dimensional taxonomy to classify those roles and map them specifically to the life sciences. We identify a "sweet spot" in the taxonomy, a convergence where bioinformaticians should concentrate their efforts in order to derive the most value from the time they spend using code. We suggest the use of the "inverse Conway maneuver" to shape a research team so as to allow dedicated software engineers to interface with researchers working in this "sweet spot." We conclude that in order to address current issues in the use of software in life science research such as reproducibility and scalability, the field must reevaluate its relationship with software engineering, and adapt its research structures to overcome current issues in bioinformatics such as reproducibility, scalability and productivity.

Subjects

Biological Science Disciplines , Software , Computational Biology/methods , Reproducibility of Results

3.

The democratization of bioinformatics: A software engineering perspective.

Lawlor, Brendan; Sleator, Roy D.

Gigascience ; 9(6), 2020 Jun 01.

Article in English | MEDLINE | ID: mdl-32562490

ABSTRACT

Today, thanks to advances in cloud computing, it is possible for small teams of software developers to produce internet-scale products, a feat that was previously the preserve of large organizations. Herein, we describe how these advances in software engineering can be made more readily available to bioinformaticians. In the same way that cloud computing has democratized access to distributed systems engineering for generalist software engineers, access to scalable and reproducible bioinformatic engineering can be democratized for generalist bioinformaticians and biologists. We present solutions, based on our own efforts, to achieve this goal.

Subjects

Computational Biology/methods , Software , Cloud Computing , Genomics/methods

4.

Simplicity DiffExpress: A Bespoke Cloud-Based Interface for RNA-seq Differential Expression Modeling and Analysis.

Palu, Cintia C; Ribeiro-Alves, Marcelo; Wu, Yanxin; Lawlor, Brendan; Baranov, Pavel V; Kelly, Brian; Walsh, Paul.

Front Genet ; 10: 356, 2019.

Article in English | MEDLINE | ID: mdl-31139204

ABSTRACT

One of the key challenges for transcriptomics-based research is not only the processing of large data but also modeling the complexity of features that are sources of variation across samples, which is required for an accurate statistical analysis. Therefore, our goal is to foster access for wet lab researchers to bioinformatics tools, in order to enhance their ability to explore biological aspects and validate hypotheses with robust analysis. In this context, user-friendly interfaces can enable researchers to apply computational biology methods without requiring bioinformatics expertise. Such bespoke platforms can improve the quality of the findings by allowing the researcher to freely explore the data and test a new hypothesis with independence. Simplicity DiffExpress is a data-driven software platform dedicated to enabling non-bioinformaticians to take ownership of the differential expression analysis (DEA) step in a transcriptomics experiment while presenting the results in a comprehensible layout, which supports an efficient results exploration, information storage, and reproducibility. Simplicity DiffExpress' key component is the bespoke statistical model validation that guides the user through any necessary alteration in the dataset or model, tackling the challenges behind complex data analysis. The software utilizes edgeR, and it is implemented as part of the Simplicity™ platform, providing a dynamic interface, with well-organized results that are easy to navigate and are shareable. Computational biologists and bioinformaticians can also benefit from its use since the data validation is more informative than the usual DEA resources. Wet-lab collaborators can benefit from receiving their results in an organized interface. Simplicity DiffExpress is freely available for academic use, and it is cloud-based (https://simplicity.nsilico.com/dea).

5.

Field of genes: using Apache Kafka as a bioinformatic data repository.

Lawlor, Brendan; Lynch, Richard; Mac Aogáin, Micheál; Walsh, Paul.

Gigascience ; 7(4), 2018 Apr 01.

Article in English | MEDLINE | ID: mdl-29635394

ABSTRACT

Background: Bioinformatic research is increasingly dependent on large-scale datasets, accessed either from private or public repositories. An example of a public repository is the National Center for Biotechnology Information's (NCBI's) Reference Sequence (RefSeq). These repositories must decide in what form to make their data available. Unstructured data can be put to almost any use but are limited in how access to them can be scaled. Highly structured data offer improved performance for specific algorithms but limit the wider usefulness of the data. We present an alternative: lightly structured data stored in Apache Kafka in a way that is amenable to parallel access and streamed processing, including subsequent transformations into more highly structured representations. We contend that this approach could provide a flexible and powerful nexus of bioinformatic data, bridging the gap between low structure on one hand, and high performance and scale on the other. To demonstrate this, we present a proof-of-concept version of NCBI's RefSeq database using this technology. We measure the performance and scalability characteristics of this alternative with respect to flat files. Results: The proof of concept scales almost linearly as more compute nodes are added, outperforming the standard approach using files. Conclusions: Apache Kafka merits consideration as a fast and more scalable but general-purpose way to store and retrieve bioinformatic data, for public, centralized reference datasets such as RefSeq and for private clinical and experimental data.

Subjects

Databases, Genetic , Information Storage and Retrieval , Computational Biology

6.

Fourteen Draft Genome Sequences for the First Reported Cases of Azithromycin-Resistant Neisseria gonorrhoeae in Ireland.

Mac Aogáin, Micheál; Fennelly, Nicholas; Walsh, Anne; Lynagh, Yvonne; Bekaert, Michaël; Lawlor, Brendan; Walsh, Paul; Kelly, Brian; Rogers, Thomas R; Crowley, Brendan.

Genome Announc ; 5(23), 2017 Jun 08.

Article in English | MEDLINE | ID: mdl-28596392

ABSTRACT

Here, we report the draft genome assemblies of 14 azithromycin-resistant Neisseria gonorrhoeae clinical isolates, representing the first such strains identified in Ireland. Among these isolates are the first reported highly resistant strains (MIC >256 mg/liter), which both belonged to the ST1580 sequence type.

7.

Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

Lawlor, Brendan; Walsh, Paul.

Bioengineered ; 6(4): 193-203, 2015.

Article in English | MEDLINE | ID: mdl-25996054

ABSTRACT

There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians.

Subjects

Computational Biology/education , Data Mining/methods , Software/supply & distribution , Career Choice , Computational Biology/organization & administration , Equipment Design , Humans , Reproducibility of Results , Sample Size
