Querying semantic catalogues of biomedical databases.
Pereira, Arnaldo; Almeida, João Rafael; Lopes, Rui Pedro; Oliveira, José Luís.
Affiliation
  • Pereira A; DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal. Electronic address: arnaldop@ua.pt.
  • Almeida JR; DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal; Department of Computation, University of A Coruña, A Coruña, Spain. Electronic address: joao.rafael.almeida@ua.pt.
  • Lopes RP; CeDRI, Polytechnic Institute of Bragança, Bragança, Portugal. Electronic address: rlopes@ipb.pt.
  • Oliveira JL; DETI/IEETA, LASI, University of Aveiro, Aveiro, Portugal. Electronic address: jlo@ua.pt.
J Biomed Inform ; 137: 104272, 2023 01.
Article in En | MEDLINE | ID: mdl-36563828
ABSTRACT

BACKGROUND:

Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for a successful observational study is to define the research question and the approach before executing the study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset.

METHODS:

We propose a strategy for retrieving semantically annotated biomedical datasets of interest through an interface built with a methodology that transforms natural language questions into formal language queries. Natural language interfaces enhance the advantages of creating biomedical semantic data, because users can issue complex queries without manipulating a logical query language.

RESULTS:

Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted the data to a semantic format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/.

CONCLUSION:

We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.
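
As an illustration only, and not the authors' implementation, the three question types named in the abstract (single-concept, exclusion criteria, multi-concept) can be pictured as variations of one formal query template. The sketch below assumes the formal language is SPARQL, which the abstract does not state; the `ex:` namespace, the `ex:Dataset` class, the `ex:hasConcept` predicate, the concept names, and the `build_query` helper are all hypothetical placeholders.

```python
from typing import Sequence

# Hypothetical vocabulary: placeholder IRIs, not the ontologies used in the paper.
PREFIXES = "PREFIX ex: <http://example.org/catalogue#>\n"


def build_query(include: Sequence[str], exclude: Sequence[str] = ()) -> str:
    """Build a SPARQL SELECT that lists datasets annotated with every concept
    in `include` and with none of the concepts in `exclude`."""
    if not include:
        raise ValueError("at least one concept is required")
    lines = [PREFIXES + "SELECT DISTINCT ?dataset WHERE {"]
    lines.append("  ?dataset a ex:Dataset .")
    # One triple pattern per required concept (patterns are implicitly AND-ed).
    for concept in include:
        lines.append(f"  ?dataset ex:hasConcept ex:{concept} .")
    # One negated pattern per excluded concept.
    for concept in exclude:
        lines.append(
            f"  FILTER NOT EXISTS {{ ?dataset ex:hasConcept ex:{concept} }}"
        )
    lines.append("}")
    return "\n".join(lines)


# Single-concept question: "Which datasets record a cognitive test?"
single = build_query(["CognitiveTest"])

# Question with exclusion criteria: "... a cognitive test but not stroke?"
with_exclusion = build_query(["CognitiveTest"], exclude=["Stroke"])

# Multi-concept question: "... a cognitive test and a genetic marker?"
multi = build_query(["CognitiveTest", "GeneticMarker"])

print(with_exclusion)
```

In a natural language interface of the kind described, the step that maps the question text to the `include` and `exclude` concept lists is the hard part; this sketch starts after that step and shows only how the resulting concepts could be arranged into a formal query.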

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Semantics / Natural Language Processing Type of study: Observational_studies Language: En Journal: J Biomed Inform Year: 2023 Document type: Article
