Search | BVS Integralidade em Saúde

De-novo FAIRification via an Electronic Data Capture system by automated transformation of filled electronic Case Report Forms into machine-readable data.

Kersloot, Martijn G; Jacobsen, Annika; Groenen, Karlijn H J; Dos Santos Vieira, Bruna; Kaliyaperumal, Rajaram; Abu-Hanna, Ameen; Cornet, Ronald; 't Hoen, Peter A C; Roos, Marco; Schultze Kool, Leo; Arts, Derk L.

J Biomed Inform ; 122: 103897, 2021 10.

Article in English | MEDLINE | ID: mdl-34454078

ABSTRACT

INTRODUCTION: Existing methods to make data Findable, Accessible, Interoperable, and Reusable (FAIR) are usually carried out in a post hoc manner: after the research project is conducted and data are collected. De-novo FAIRification, on the other hand, incorporates the FAIRification steps in the process of a research project. In medical research, data is often collected and stored via electronic Case Report Forms (eCRFs) in Electronic Data Capture (EDC) systems. By implementing a de novo FAIRification process in such a system, the reusability and, thus, scalability of FAIRification across research projects can be greatly improved. In this study, we developed and implemented a novel method for de novo FAIRification via an EDC system. We evaluated our method by applying it to the Registry of Vascular Anomalies (VASCA).

METHODS: Our EDC and research project independent method ensures that eCRF data entered into an EDC system can be transformed into machine-readable, FAIR data using a semantic data model (a canonical representation of the data, based on ontology concepts and semantic web standards) and mappings from the model to questions on the eCRF. The FAIRified data are stored in a triple store and can, together with associated metadata, be accessed and queried through a FAIR Data Point. The method was implemented in Castor EDC, an EDC system, through a data transformation application. The FAIRness of the output of the method, the FAIRified data and metadata, was evaluated using the FAIR Evaluation Services.

RESULTS: We successfully applied our FAIRification method to the VASCA registry. Data entered on eCRFs is automatically transformed into machine-readable data and can be accessed and queried using SPARQL queries in the FAIR Data Point. Twenty-one FAIR Evaluator tests pass and one test regarding the metadata persistence policy fails, since this policy is not in place yet.

CONCLUSION: In this study, we developed a novel method for de novo FAIRification via an EDC system. Its application in the VASCA registry and the automated FAIR evaluation show that the method can be used to make clinical research data FAIR when they are entered in an eCRF without any intervention from data management and data entry personnel. Due to the generic approach and developed tooling, we believe that our method can be used in other registries and clinical trials as well.

Subjects

Biomedical Research, Metadata, Data Management, Electronics, Registries

Perceptions and behavior of clinical researchers and research support staff regarding data FAIRification.

Kersloot, Martijn G; Abu-Hanna, Ameen; Cornet, Ronald; Arts, Derk L.

Sci Data ; 9(1): 241, 2022 05 27.

Article in English | MEDLINE | ID: mdl-35624282

ABSTRACT

The FAIR Data Principles are being rapidly adopted by many research institutes and funders worldwide. This study aimed to assess the awareness and attitudes of clinical researchers and research support staff regarding data FAIRification. A questionnaire was distributed to researchers and support staff in six Dutch University Medical Centers and Electronic Data Capture platform users. 164 researchers and 21 support staff members completed the questionnaire. 62.8% of the researchers and 81.0% of the support staff are currently undertaking at least some effort to achieve any aspect of FAIR; 11.0% and 23.8%, respectively, address all aspects. Only 46.6% of the researchers add metadata to their datasets, 39.7% add metadata to data elements, and 35.9% deposit their data in a repository. 94.7% of the researchers are aware of the usefulness of their data being FAIR for others and 89.3% are, given the right resources and support, willing to FAIRify their data. Institutions and funders should, therefore, develop FAIRification training and tools and should (financially) support researchers and staff throughout the process.

FAIRification Efforts of Clinical Researchers: The Current State of Affairs.

Kersloot, Martijn G; van Damme, Philip; Abu-Hanna, Ameen; Arts, Derk L; Cornet, Ronald.

Stud Health Technol Inform ; 287: 35-39, 2021 Nov 18.

Article in English | MEDLINE | ID: mdl-34795075

ABSTRACT

The FAIR Principles are supported by various initiatives in the biomedical community. However, little is known about the knowledge and efforts of individual clinical researchers regarding data FAIRification. We distributed an online questionnaire to researchers from six Dutch University Medical Centers, as well as researchers using an Electronic Data Capture platform, to gain insight into their understanding of and experience with data FAIRification. 164 researchers completed the questionnaire. 64.0% of them had heard of the FAIR Principles. 62.8% of the researchers spent some or a lot of effort to achieve any aspect of FAIR and 11.0% addressed all aspects. Most researchers were unaware of the Principles' emphasis on both human- and machine-readability, as their FAIRification efforts were primarily focused on achieving human-readability (93.9%), rather than machine-readability (31.2%). In order to make machine-readable, FAIR data a reality, researchers require proper training, support, and tools to help them understand the importance of data FAIRification and guide them through the FAIRification process.

The de novo FAIRification process of a registry for vascular anomalies.

Groenen, Karlijn H J; Jacobsen, Annika; Kersloot, Martijn G; Dos Santos Vieira, Bruna; van Enckevort, Esther; Kaliyaperumal, Rajaram; Arts, Derk L; 't Hoen, Peter A C; Cornet, Ronald; Roos, Marco; Kool, Leo Schultze.

Orphanet J Rare Dis ; 16(1): 376, 2021 09 04.

Article in English | MEDLINE | ID: mdl-34481493

ABSTRACT

BACKGROUND: Patient data registries that are FAIR (Findable, Accessible, Interoperable, and Reusable for humans and computers) facilitate research across multiple resources. This is particularly relevant to rare diseases, where data often are scarce and scattered. Specific research questions can be asked across FAIR rare disease registries and other FAIR resources without physically combining the data. Further, FAIR implies well-defined, transparent access conditions, which supports making sensitive data as open as possible and as closed as necessary.

RESULTS: We successfully developed and implemented a process of making a rare disease registry for vascular anomalies FAIR from its conception (de novo). Here, we describe the five phases of this process in detail: (i) pre-FAIRification, (ii) facilitating FAIRification, (iii) data collection, (iv) generating FAIR data in real-time, and (v) using FAIR data. This includes the creation of an electronic case report form and a semantic data model of the elements to be collected (in this case: the "Set of Common Data Elements for Rare Disease Registration" released by the European Commission), and the technical implementation of automatic, real-time data FAIRification in an Electronic Data Capture system. Further, we describe how we contribute to the four facets of FAIR, and how our FAIRification process can be reused by other registries.

CONCLUSIONS: In conclusion, a detailed de novo FAIRification process of a registry for vascular anomalies is described. To a large extent, the process may be reused by other rare disease registries, and we envision this work to be a substantial contribution to an ecosystem of FAIR rare disease resources.

Subjects

Ecosystem, Rare Diseases, Humans, Rare Diseases/epidemiology, Registries

Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies.

Kersloot, Martijn G; van Putten, Florentien J P; Abu-Hanna, Ameen; Cornet, Ronald; Arts, Derk L.

J Biomed Semantics ; 11(1): 14, 2020 11 16.

Article in English | MEDLINE | ID: mdl-33198814

ABSTRACT

BACKGROUND: Free-text descriptions in electronic health records (EHRs) can be of interest for clinical research and care optimization. However, free text cannot be readily interpreted by a computer and, therefore, has limited value. Natural Language Processing (NLP) algorithms can make free text machine-interpretable by attaching ontology concepts to it. However, implementations of NLP algorithms are not evaluated consistently. Therefore, the objective of this study was to review the current methods used for developing and evaluating NLP algorithms that map clinical text fragments onto ontology concepts. To standardize the evaluation of algorithms and reduce heterogeneity between studies, we propose a list of recommendations.

METHODS: Two reviewers examined publications indexed by Scopus, IEEE, MEDLINE, EMBASE, the ACM Digital Library, and the ACL Anthology. Publications reporting on NLP for mapping clinical text from EHRs to ontology concepts were included. Year, country, setting, objective, evaluation and validation methods, NLP algorithms, terminology systems, dataset size and language, performance measures, reference standard, generalizability, operational use, and source code availability were extracted. The studies' objectives were categorized by way of induction. These results were used to define recommendations.

RESULTS: Two thousand three hundred fifty five unique studies were identified. Two hundred fifty six studies reported on the development of NLP algorithms for mapping free text to ontology concepts. Seventy-seven described development and evaluation. Twenty-two studies did not perform a validation on unseen data and 68 studies did not perform external validation. Of 23 studies that claimed that their algorithm was generalizable, 5 tested this by external validation. A list of sixteen recommendations regarding the usage of NLP systems and algorithms, usage of data, evaluation and validation, presentation of results, and generalizability of results was developed.

CONCLUSION: We found many heterogeneous approaches to the reporting on the development and evaluation of NLP algorithms that map clinical text to ontology concepts. Over one-fourth of the identified publications did not perform an evaluation. In addition, over one-fourth of the included studies did not perform a validation, and 88% did not perform external validation. We believe that our recommendations, alongside an existing reporting standard, will increase the reproducibility and reusability of future studies and NLP algorithms in medicine.

Subjects

Algorithms, Biological Ontologies, Natural Language Processing, Humans

Automated SNOMED CT concept and attribute relationship detection through a web-based implementation of cTAKES.

Kersloot, Martijn G; Lau, Francis; Abu-Hanna, Ameen; Arts, Derk L; Cornet, Ronald.

J Biomed Semantics ; 10(1): 14, 2019 09 18.

Article in English | MEDLINE | ID: mdl-31533810

ABSTRACT

BACKGROUND: Information in Electronic Health Records is largely stored as unstructured free text. Natural language processing (NLP), or Medical Language Processing (MLP) in medicine, aims at extracting structured information from free text, and is less expensive and time-consuming than manual extraction. However, most algorithms in MLP are institution-specific or address only one clinical need, and thus cannot be broadly applied. In addition, most MLP systems do not detect concepts in misspelled text and cannot detect attribute relationships between concepts. The objective of this study was to develop and evaluate an MLP application that includes generic algorithms for the detection of (misspelled) concepts and of attribute relationships between them.

METHODS: An implementation of the MLP system cTAKES, called DIRECT, was developed with generic SNOMED CT concept filter, concept relationship detection, and attribute relationship detection algorithms and a custom dictionary. Four implementations of cTAKES were evaluated by comparing 98 manually annotated oncology charts with the output of DIRECT. The F1-score was determined for named-entity recognition and attribute relationship detection for the concepts 'lung cancer', 'non-small cell lung cancer', and 'recurrence'. The performance of the four implementations was compared with a two-tailed permutation test.

RESULTS: DIRECT detected lung cancer and non-small cell lung cancer concepts with F1-scores between 0.828 and 0.947 and between 0.862 and 0.933, respectively. The concept recurrence was detected with a significantly higher F1-score of 0.921, compared to the other implementations, and the relationship between recurrence and lung cancer with an F1-score of 0.857. The precisions of the detection of lung cancer, non-small cell lung cancer, and recurrence concepts were 1.000, 0.966, and 0.879, compared to precisions of 0.943, 0.967, and 0.000 in the original implementation, respectively.

CONCLUSION: DIRECT can detect oncology concepts and attribute relationships with high precision and can detect recurrence with a significant increase in F1-score, compared to the original implementation of cTAKES, due to the usage of a custom dictionary and a generic concept relationship detection algorithm. These concepts and relationships can be used to encode clinical narratives, and can thus substantially reduce manual chart abstraction efforts, saving time for clinicians and researchers.

Subjects

Internet, Natural Language Processing, Automation, Non-Small Cell Lung Carcinoma/diagnosis, Electronic Health Records, Humans, Lung Neoplasms/diagnosis
