Towards Machine-FAIR: Representing software and datasets to facilitate reuse and scientific discovery by machines.
J Biomed Inform; 154: 104647, 2024 Jun.
Article in En | MEDLINE | ID: mdl-38692465
ABSTRACT
OBJECTIVE:
To use software, datasets, and data formats in the domain of Infectious Disease Epidemiology as a test collection to evaluate a novel M1 use case, which we introduce in this paper. M1 is a machine that, upon receipt of a new digital object of research, exhaustively finds all valid compositions of it with existing objects.
METHOD:
We implemented a data-format-matching-only M1 using exhaustive search, which we refer to as M1DFM. We then ran M1DFM on the test collection and used error analysis to identify needed semantic constraints.
RESULTS:
The precision of the M1DFM search was 61.7%. Error analysis identified needed semantic constraints and needed changes in the handling of data services. Most semantic constraints were simple, but one data format was complex enough that representing semantic constraints over it was practically impossible. From this we conclude that there is a practical limit: software developers will have to meet the machines halfway by engineering software whose inputs are simple enough for their semantic constraints to be represented, akin to the simple APIs of services. We summarize these insights as M1-FAIR guiding principles for composability and suggest a roadmap for progressively capable devices in the service of reuse and accelerated scientific discovery.
CONCLUSION:
Algorithmic search of digital repositories for valid workflow compositions has potential to accelerate scientific discovery but requires a scalable solution to the problem of knowledge acquisition about semantic constraints on software inputs. Additionally, practical limitations on the logical complexity of semantic constraints must be respected, which has implications for the design of software.
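The data-format-matching search described under METHOD can be illustrated with a minimal sketch. This is not the authors' M1DFM implementation: the `DigitalObject` class, the `find_compositions` and `precision` functions, and every object and format name below are hypothetical, chosen only to show how matching on formats alone can propose compositions that a reviewer later rejects on semantic grounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalObject:
    """A dataset or a piece of software, described only by its data formats (hypothetical model)."""
    name: str
    inputs: frozenset = frozenset()   # formats consumed (empty for a dataset)
    outputs: frozenset = frozenset()  # formats produced or provided


def find_compositions(new_obj, repository):
    """Exhaustively pair new_obj with every repository object, in both directions.

    Returns (producer, consumer, format) triples wherever an output format
    of one object equals an input format of the other.
    """
    matches = []
    for other in repository:
        for producer, consumer in ((new_obj, other), (other, new_obj)):
            for fmt in sorted(producer.outputs & consumer.inputs):
                matches.append((producer.name, consumer.name, fmt))
    return matches


def precision(candidates, valid):
    """Fraction of proposed compositions that a reviewer judged valid."""
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if c in valid) / len(candidates)


# Hypothetical repository: one dataset and one simulator.
repository = [
    DigitalObject("case_counts", outputs=frozenset({"csv"})),
    DigitalObject("simulator", inputs=frozenset({"csv"}), outputs=frozenset({"netcdf"})),
]

# Hypothetical new object: a viewer that reads two formats.
viewer = DigitalObject("viewer", inputs=frozenset({"csv", "netcdf"}))

candidates = find_compositions(viewer, repository)

# Assume a reviewer accepts only the simulator-to-viewer composition,
# for example because the viewer expects CSV with different semantics.
valid = {("simulator", "viewer", "netcdf")}
score = precision(candidates, valid)
```

In this toy case the format match proposes two compositions and only one survives review, which is the kind of gap that semantic constraints on software inputs are meant to close.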
Full text: 1
Database: MEDLINE
Main subject: Software
Limits: Humans
Language: En
Journal: J Biomed Inform
Journal subject: MEDICAL INFORMATICS
Publication year: 2024
Document type: Article