Recommendations for the FAIRification of genomic track metadata.

Gundersen, Sveinung; Boddu, Sanjay; Capella-Gutierrez, Salvador; Drabløs, Finn; Fernández, José M; Kompova, Radmila; Taylor, Kieron; Titov, Dmytro; Zerbino, Daniel; Hovig, Eivind.

F1000Res ; 10, 2021.

Article in English | MEDLINE | ID: mdl-34249331

ABSTRACT

Background: Many types of data from genomic analyses can be represented as genomic tracks, i.e. features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in locating, accessing and combining relevant tracks from external sources, as well as locating the raw data, reducing the value of the generated information.

Description of work: We propose to advance the application of FAIR data principles (Findable, Accessible, Interoperable, and Reusable) to produce searchable metadata for genomic tracks. Findability and Accessibility of metadata can then be ensured by a track search service that integrates globally identifiable metadata from various track hubs in the Track Hub Registry and other relevant repositories. Interoperability and Reusability need to be ensured by the specification and implementation of a basic set of recommendations for metadata. We have tested this concept by developing such a specification in a JSON Schema, called FAIRtracks, and have integrated it into a novel track search service, called TrackFind. We demonstrate practical usage by importing datasets through TrackFind into existing examples of relevant analytical tools for genomic tracks: EPICO and the GSuite HyperBrowser.

Conclusion: We here provide a first iteration of a draft standard for genomic track metadata, as well as the accompanying software ecosystem. It can easily be adapted or extended to future needs of the research community regarding data, methods and tools, balancing the requirements of both data submitters and analytical end-users.

Subject(s)

Ecosystem , Metadata , Genome , Genomics , Software

The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires.

Pavlovic, Milena; Scheffer, Lonneke; Motwani, Keshav; Kanduri, Chakravarthi; Kompova, Radmila; Vazov, Nikolay; Waagan, Knut; Bernal, Fabian L M; Costa, Alexandre Almeida; Corrie, Brian; Akbar, Rahmad; Al Hajj, Ghadi S; Balaban, Gabriel; Brusko, Todd M; Chernigovskaya, Maria; Christley, Scott; Cowell, Lindsay G; Frank, Robert; Grytten, Ivar; Gundersen, Sveinung; Haff, Ingrid Hobæk; Hovig, Eivind; Hsieh, Ping-Han; Klambauer, Günter; Kuijjer, Marieke L; Lund-Andersen, Christin; Martini, Antonio; Minotto, Thomas; Pensar, Johan; Rand, Knut; Riccardi, Enrico; Robert, Philippe A; Rocha, Artur; Slabodkin, Andrei; Snapkov, Igor; Sollid, Ludvig M; Titov, Dmytro; Weber, Cédric R; Widrich, Michael; Yaari, Gur; Greiff, Victor; Sandve, Geir Kjetil.

Nat Mach Intell ; 3(11): 936-944, 2021 Nov.

Article in English | MEDLINE | ID: mdl-37396030

ABSTRACT

Adaptive immune receptor repertoires (AIRR) are key targets for biomedical research as they record past and ongoing adaptive immune responses. The capacity of machine learning (ML) to identify complex discriminative sequence patterns renders it an ideal approach for AIRR-based diagnostic and therapeutic discovery. To date, widespread adoption of AIRR ML has been inhibited by a lack of reproducibility, transparency, and interoperability. immuneML (immuneml.uio.no) addresses these concerns by implementing each step of the AIRR ML process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows. To facilitate widespread user adoption, immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel deep learning method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of AIRR ML.

Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.

Simovski, Boris; Kanduri, Chakravarthi; Gundersen, Sveinung; Titov, Dmytro; Domanska, Diana; Bock, Christoph; Bossini-Castillo, Lara; Chikina, Maria; Favorov, Alexander; Layer, Ryan M; Mironov, Andrey A; Quinlan, Aaron R; Sheffield, Nathan C; Trynka, Gosia; Sandve, Geir K.

Nucleic Acids Res ; 46(W1): W186-W193, 2018 07 02.

Article in English | MEDLINE | ID: mdl-29873782

ABSTRACT

Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.
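As a rough illustration of the kind of test the abstract describes, the sketch below counts overlaps between two region-sets and estimates significance with a permutation null model. It is not Coloc-stats code and does not reproduce any of the methods it wraps. The function names, the single-chromosome simplification, the half-open interval convention and the uniform-shuffling null model are all assumptions made for this example.

```python
import random


def count_overlaps(query, reference):
    """Count query regions that overlap at least one reference region.

    Regions are half-open (start, end) pairs on one chromosome (assumption).
    """
    return sum(
        any(q_start < r_end and r_start < q_end for r_start, r_end in reference)
        for q_start, q_end in query
    )


def shuffle_regions(regions, genome_length, rng):
    """Hypothetical null model: move each region to a uniformly random
    position while preserving its length."""
    shuffled = []
    for start, end in regions:
        length = end - start
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled


def permutation_p_value(query, reference, genome_length,
                        n_permutations=1000, seed=0):
    """Return the observed overlap count and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(query, reference)
    # Count shuffled query sets that overlap at least as often as observed.
    at_least_as_extreme = sum(
        count_overlaps(shuffle_regions(query, genome_length, rng), reference)
        >= observed
        for _ in range(n_permutations)
    )
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    query = [(100, 200), (300, 400), (900, 950)]
    reference = [(150, 250), (350, 450), (5000, 5100)]
    observed, p_value = permutation_p_value(query, reference, genome_length=10_000)
    print(observed, p_value)
```

Changing the overlap measure or the shuffling rule can change the resulting p-value for the same pair of region-sets, which is the reason the abstract gives for comparing several methods side by side.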

Subject(s)

Genomics/methods , Software , Chromatin Immunoprecipitation , GATA1 Transcription Factor/metabolism , Internet , Sequence Analysis, DNA , User-Computer Interface
