Search | VHL Search Portal

A SARS-CoV-2 sequence submission tool for the European Nucleotide Archive.

Roncoroni, Miguel; Droesbeke, Bert; Eguinoa, Ignacio; De Ruyck, Kim; D'Anna, Flora; Yusuf, Dilmurat; Grüning, Björn; Backofen, Rolf; Coppens, Frederik.

Bioinformatics ; 37(21): 3983-3985, 2021 11 05.

Article in English | MEDLINE | ID: mdl-34096994

ABSTRACT

SUMMARY: Many aspects of the global response to the COVID-19 pandemic are enabled by the fast and open publication of SARS-CoV-2 genetic sequence data. The European Nucleotide Archive (ENA) is the European recommended open repository for genetic sequences. In this work, we present a tool for submitting raw sequencing reads of SARS-CoV-2 to ENA. The tool features a single-step submission process, a graphical user interface, tabular-formatted metadata and the possibility to remove human reads prior to submission. A Galaxy wrap of the tool allows users with little or no bioinformatics knowledge to do bulk sequencing read submissions. The tool is also packed in a Docker container to ease deployment.

AVAILABILITY AND IMPLEMENTATION: CLI ENA upload tool is available at github.com/usegalaxy-eu/ena-upload-cli (DOI 10.5281/zenodo.4537621); Galaxy ENA upload tool at toolshed.g2.bx.psu.edu/view/iuc/ena_upload/382518f24d6d and github.com/galaxyproject/tools-iuc/tree/master/tools/ena_upload (development); and ENA upload Galaxy container at github.com/ELIXIR-Belgium/ena-upload-container (DOI 10.5281/zenodo.4730785).

Subject(s)

COVID-19 , Software , Humans , SARS-CoV-2 , Nucleotides , Pandemics

No more business as usual: Agile and effective responses to emerging pathogen threats require open data and open analytics.

Baker, Dannon; van den Beek, Marius; Blankenberg, Daniel; Bouvier, Dave; Chilton, John; Coraor, Nate; Coppens, Frederik; Eguinoa, Ignacio; Gladman, Simon; Grüning, Björn; Keener, Nicholas; Larivière, Delphine; Lonie, Andrew; Kosakovsky Pond, Sergei; Maier, Wolfgang; Nekrutenko, Anton; Taylor, James; Weaver, Steven.

PLoS Pathog ; 16(8): e1008643, 2020 08.

Article in English | MEDLINE | ID: mdl-32790776

ABSTRACT

The current state of much of the Wuhan pneumonia virus (severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2]) research shows a regrettable lack of data sharing and considerable analytical obfuscation. This impedes global research cooperation, which is essential for tackling public health emergencies and requires unimpeded access to data, analysis tools, and computational infrastructure. Here, we show that community efforts in developing open analytical software tools over the past 10 years, combined with national investments into scientific computational infrastructure, can overcome these deficiencies and provide an accessible platform for tackling global health emergencies in an open and transparent manner. Specifically, we use all SARS-CoV-2 genomic data available in the public domain so far to (1) underscore the importance of access to raw data and (2) demonstrate that existing community efforts in curation and deployment of biomedical software can reliably support rapid, reproducible research during global health crises. All our analyses are fully documented at https://github.com/galaxyproject/SARS-CoV-2.

Subject(s)

Betacoronavirus/pathogenicity , Coronavirus Infections/virology , Pneumonia, Viral/virology , Public Health , Severe Acute Respiratory Syndrome/virology , COVID-19 , Data Analysis , Humans , Pandemics , SARS-CoV-2

The Arabidopsis condensin CAP-D subunits arrange interphase chromatin.

Municio, Celia; Antosz, Wojciech; Grasser, Klaus D; Kornobis, Etienne; Van Bel, Michiel; Eguinoa, Ignacio; Coppens, Frederik; Bräutigam, Andrea; Lermontova, Inna; Bruckmann, Astrid; Zelkowska, Katarzyna; Houben, Andreas; Schubert, Veit.

New Phytol ; 230(3): 972-987, 2021 05.

Article in English | MEDLINE | ID: mdl-33475158

ABSTRACT

Condensins are best known for their role in shaping chromosomes. Other functions such as organizing interphase chromatin and transcriptional control have been reported in yeasts and animals, but little is known about their function in plants. To elucidate the specific composition of condensin complexes and the expression of CAP-D2 (condensin I) and CAP-D3 (condensin II), we performed biochemical analyses in Arabidopsis. The role of CAP-D3 in interphase chromatin organization and function was evaluated using cytogenetic and transcriptome analysis in cap-d3 T-DNA insertion mutants. CAP-D2 and CAP-D3 are highly expressed in mitotically active tissues. In silico and pull-down experiments indicate that both CAP-D proteins interact with the other condensin I and II subunits. In cap-d3 mutants, an association of heterochromatic sequences occurs, but the nuclear size and the general histone and DNA methylation patterns remain unchanged. Also, CAP-D3 influences the expression of genes affecting the response to water, chemicals, and stress. The expression and composition of the condensin complexes in Arabidopsis are similar to those in other higher eukaryotes. We propose a model for the CAP-D3 function during interphase in which CAP-D3 localizes in euchromatin loops to stiffen them and consequently separates centromeric regions and 45S rDNA repeats.

Subject(s)

Arabidopsis , Chromatin , Adenosine Triphosphatases/genetics , Animals , Arabidopsis/genetics , DNA-Binding Proteins , Interphase , Multiprotein Complexes

Ten simple rules for making a software tool workflow-ready.

Brack, Paul; Crowther, Peter; Soiland-Reyes, Stian; Owen, Stuart; Lowe, Douglas; Williams, Alan R; Groom, Quentin; Dillen, Mathias; Coppens, Frederik; Grüning, Björn; Eguinoa, Ignacio; Ewels, Philip; Goble, Carole.

PLoS Comput Biol ; 18(3): e1009823, 2022 03.

Article in English | MEDLINE | ID: mdl-35324885

Subject(s)

Computational Biology , Software , Workflow

Expanding the Galaxy's reference data.

VijayKrishna, Nagampalli; Joshi, Jayadev; Coraor, Nate; Hillman-Jackson, Jennifer; Bouvier, Dave; van den Beek, Marius; Eguinoa, Ignacio; Coppens, Frederik; Davis, John; Stolarczyk, Michal; Sheffield, Nathan C; Gladman, Simon; Cuccuru, Gianmauro; Grüning, Björn; Soranzo, Nicola; Rasche, Helena; Langhorst, Bradley W; Bernt, Matthias; Fornika, Dan; de Lima Morais, David Anderson; Barrette, Michel; van Heusden, Peter; Petrillo, Mauro; Puertas-Gallardo, Antonio; Patak, Alex; Hotz, Hans-Rudolf; Blankenberg, Daniel.

Bioinform Adv ; 2(1): vbac030, 2022.

Article in English | MEDLINE | ID: mdl-35669346

ABSTRACT

SUMMARY: Properly and effectively managing reference datasets is an important task for many bioinformatics analyses. Refgenie is a reference asset management system that allows users to easily organize, retrieve and share such datasets. Here, we describe the integration of refgenie into the Galaxy platform. Server administrators are able to configure Galaxy to make use of reference datasets made available on a refgenie instance. In addition, a Galaxy Data Manager tool has been developed to provide a graphical interface to refgenie's remote reference retrieval functionality. A large collection of reference datasets has also been made available using the CVMFS (CernVM File System) repository from GalaxyProject.org, with mirrors across the USA, Canada, Europe and Australia, enabling easy use outside of Galaxy.

AVAILABILITY AND IMPLEMENTATION: The ability of Galaxy to use refgenie assets was added to the core Galaxy framework in version 22.01, which is available from https://github.com/galaxyproject/galaxy under the Academic Free License version 3.0. The refgenie Data Manager tool can be installed via the Galaxy ToolShed, with source code managed at https://github.com/BlankenbergLab/galaxy-tools-blankenberg/tree/main/data_managers/data_manager_refgenie_pull and released using an MIT license. Access to existing data is also available through CVMFS, with instructions at https://galaxyproject.org/admin/reference-data-repo/. No new data were generated or analyzed in support of this research.

Precursor Intensity-Based Label-Free Quantification Software Tools for Proteomic and Multi-Omic Analysis within the Galaxy Platform.

Mehta, Subina; Easterly, Caleb W; Sajulga, Ray; Millikin, Robert J; Argentini, Andrea; Eguinoa, Ignacio; Martens, Lennart; Shortreed, Michael R; Smith, Lloyd M; McGowan, Thomas; Kumar, Praveen; Johnson, James E; Griffin, Timothy J; Jagtap, Pratik D.

Proteomes ; 8(3), 2020 Jul 08.

Article in English | MEDLINE | ID: mdl-32650610

ABSTRACT

For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file-formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.
