1.
Int J Digit Libr ; 25(2): 273-285, 2024.
Article in English | MEDLINE | ID: mdl-38948004

ABSTRACT

Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent it in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and, thus, fails to generate high-quality, fine-grained data. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work on the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation where participants validate the generated NLP statements. The results indicate that employing microtasks for statement validation is a promising approach despite the varying participant agreement for different microtasks.
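
The abstract above describes validating NLP-extracted statements through crowdsourced microtasks and notes that participant agreement varies. As a rough illustration only, votes on a single statement could be aggregated by majority with an agreement threshold. The function name `aggregate_votes`, the vote labels, and the 0.6 threshold are assumptions made for this sketch; they are not taken from TinyGenius.

```python
from collections import Counter


def aggregate_votes(votes, min_agreement=0.6):
    """Aggregate crowd votes for one statement (illustrative sketch).

    votes: list of labels such as "correct" / "incorrect", one per participant.
    min_agreement: hypothetical threshold, not a value from the paper.
    Returns (label, agreement); label is None when agreement is too low.
    """
    if not votes:
        return None, 0.0
    # Most frequent label and the share of participants who chose it.
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return (label if agreement >= min_agreement else None), agreement
```

A statement that fails the threshold stays undecided rather than being accepted or rejected, which is one simple way to handle microtasks on which participants disagree.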

2.
J Biomed Semantics ; 14(1): 18, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38017587

ABSTRACT

Multiple studies have investigated bibliometric features and uncategorized scholarly documents for the influential scholarly document prediction task. In this paper, we describe our work that attempts to go beyond bibliometric metadata to predict influential scholarly documents. Furthermore, this work also examines the influential scholarly document prediction task over categorized scholarly documents. We also introduce a new approach to enhance the document representation method with a domain-independent knowledge graph to find the influential scholarly document using categorized scholarly content. As the input collection, we use the WHO corpus with scholarly documents on the theme of COVID-19. This study examines different document representation methods for machine learning, including TF-IDF, BOW, and embedding-based language models (BERT). The TF-IDF document representation method works better than the others. Of the various machine learning methods tested, logistic regression outperformed the others for scholarly document category classification, and the random forest algorithm obtained the best results for influential scholarly document prediction when a domain-independent knowledge graph, specifically DBpedia, was used to enhance the document representation for categorized scholarly content. In this case, our study combines state-of-the-art machine learning methods with the BOW document representation method. We also enhance the BOW document representation with the direct type (RDF type) and unqualified relation from DBpedia. In this experiment, we did not find any impact of the enhanced document representation on scholarly document category classification. We did find an effect on influential scholarly document prediction with categorical data.


Subjects
COVID-19 , Pattern Recognition, Automated , Humans , Machine Learning , Algorithms , Language
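
The abstract of entry 2 compares TF-IDF, BOW, and BERT document representations. For readers unfamiliar with TF-IDF, the following is a minimal sketch of one common variant (relative term frequency multiplied by the log of inverse document frequency). Whitespace tokenization, this exact weighting, and the function name `tf_idf` are assumptions of the sketch; the paper's actual preprocessing and weighting are not stated in the abstract.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (illustrative variant).

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

Under this variant, a term that occurs in every document receives weight zero, so only terms that distinguish documents contribute to the representation.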
3.
Sci Rep ; 13(1): 7240, 2023 May 04.
Article in English | MEDLINE | ID: mdl-37142627

ABSTRACT

Knowledge graphs have gained increasing popularity in the last decade in science and technology. However, knowledge graphs are currently relatively simple to moderate semantic structures that are mainly a collection of factual statements. Question answering (QA) benchmarks and systems have so far been mainly geared towards encyclopedic knowledge graphs such as DBpedia and Wikidata. We present SciQA, a scientific QA benchmark for scholarly knowledge. The benchmark leverages the Open Research Knowledge Graph (ORKG), which includes almost 170,000 resources describing research contributions of almost 15,000 scholarly articles from 709 research fields. Following a bottom-up methodology, we first manually developed a set of 100 complex questions that can be answered using this knowledge graph. Furthermore, we devised eight question templates with which we automatically generated a further 2465 questions that can also be answered with the ORKG. The questions cover a range of research fields and question types and are translated into corresponding SPARQL queries over the ORKG. Based on two preliminary evaluations, we show that the resulting SciQA benchmark represents a challenging task for next-generation QA systems. This task is part of the open competitions at the 22nd International Semantic Web Conference 2023 as the Scholarly Question Answering over Linked Data (QALD) Challenge.
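
The abstract of entry 3 mentions question templates that are filled in automatically and paired with SPARQL queries. The sketch below shows the general idea of template-based generation. The question wording, the `ex:` prefix, and the predicates `ex:hasContribution` and `ex:hasResearchProblem` are hypothetical placeholders; they are not the SciQA templates or the ORKG vocabulary.

```python
from string import Template

# Hypothetical question template and matching query template.
QUESTION = Template("Which papers address the research problem $problem?")
QUERY = Template(
    "SELECT ?paper WHERE {\n"
    "  ?paper ex:hasContribution ?contribution .\n"
    "  ?contribution ex:hasResearchProblem ?problem .\n"
    '  ?problem rdfs:label "$problem" .\n'
    "}"
)


def generate(problems):
    """Fill both templates once per research-problem label."""
    return [
        {
            "question": QUESTION.substitute(problem=p),
            "sparql": QUERY.substitute(problem=p),
        }
        for p in problems
    ]
```

Calling `generate(["question answering", "entity linking"])` yields two question/query pairs. A real generator would also need to escape quotation marks in labels before placing them in a query string.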

4.
Knowl Inf Syst ; 65(5): 1989-2016, 2023.
Article in English | MEDLINE | ID: mdl-36643405

ABSTRACT

In the last decade, a large number of knowledge graph (KG) completion approaches were proposed. Albeit effective, these efforts are disjoint, and their collective strengths and weaknesses in effective KG completion have not been studied in the literature. We extend Plumber, a framework that brings together the research community's disjoint efforts on KG completion. We include more components in the architecture of Plumber so that it comprises 40 reusable components for various KG completion subtasks, such as coreference resolution, entity linking, and relation extraction. Using these components, Plumber dynamically generates suitable knowledge extraction pipelines and offers 432 distinct pipelines overall. We study the optimization problem of choosing optimal pipelines based on input sentences. To do so, we train a transformer-based classification model that extracts contextual embeddings from the input and finds an appropriate pipeline. We study the efficacy of Plumber for extracting the KG triples using standard datasets over three KGs: DBpedia, Wikidata, and Open Research Knowledge Graph. Our results demonstrate the effectiveness of Plumber in dynamically generating KG completion pipelines, outperforming all baselines agnostic of the underlying KG. Furthermore, we provide an analysis of collective failure cases, study the similarities and synergies among integrated components, and discuss their limitations.

5.
Front Res Metr Anal ; 7: 934930, 2022.
Article in English | MEDLINE | ID: mdl-35928800

ABSTRACT

Scholarly knowledge graphs provide researchers with a novel modality of information retrieval, and their wider use in academia is beneficial for the digitalization of published works and the development of scholarly communication. To increase the acceptance of scholarly knowledge graphs, we present a dashboard, which visualizes the research contributions on an educational science topic in the frame of the Open Research Knowledge Graph (ORKG). As dashboards are created at the intersection of computer science, graphic design, and human-technology interaction, we used these three perspectives to develop a multi-relational visualization tool aimed at improving the user experience. According to preliminary results of the user evaluation survey, the dashboard was perceived as more appealing than the baseline ORKG-powered interface. Our findings can be used for the development of scholarly knowledge graph-powered dashboards in different domains, thus facilitating acceptance of these novel instruments by research communities and increasing versatility in scholarly communication.

6.
Scientometrics ; 126(9): 8129-8151, 2021.
Article in English | MEDLINE | ID: mdl-34276109

ABSTRACT

The publish or perish culture of scholarly communication results in quality and relevance being subordinate to quantity. Scientific events such as conferences play an important role in scholarly communication and knowledge exchange. Researchers in many fields, such as computer science, often need to search for events to publish their research results, establish connections for collaborations with other researchers, and stay up to date with recent works. Researchers need to have a meta-research understanding of the quality of scientific events to publish in high-quality venues. However, there are many diverse and complex criteria to be explored for the evaluation of events. Thus, finding events with quality-related criteria becomes a time-consuming task for researchers and often results in an experience-based subjective evaluation. OpenResearch.org is a crowd-sourcing platform that provides features to explore previous and upcoming computer science events, based on a knowledge graph. In this paper, we devise an ontology representing scientific events metadata. Furthermore, we introduce an analytical study of the evolution of Computer Science events leveraging the OpenResearch.org knowledge graph. We identify common characteristics of these events, formalize them, and combine them as a group of metrics. These metrics can be used by potential authors to identify high-quality events. On top of the improved ontology, we analyzed the metadata of renowned conferences in various computer science communities, such as VLDB, ISWC, ESWC, WIMS, and SEMANTiCS, in order to inspect their potential as event metrics.

7.
Scientometrics ; 126(1): 641-682, 2021.
Article in English | MEDLINE | ID: mdl-33169040

ABSTRACT

Systematic assessment of scientific events has become increasingly important for research communities. A range of metrics (e.g., citations, h-index) have been developed by different research communities to make such assessments effectual. However, most of the metrics for assessing the quality of less formal publication venues and events have not yet been deeply investigated. It is also rather challenging to develop respective metrics because each research community has its own formal and informal rules of communication and quality standards. In this article, we develop a comprehensive framework of assessment metrics for evaluating scientific events and involved stakeholders. The resulting quality metrics are determined with respect to three general categories: events, persons, and bibliometrics. Our assessment methodology is empirically applied to several series of computer science events, such as conferences and workshops, using publicly available data for determining quality metrics. We show that the metrics' values coincide with the intuitive agreement of the community on its "top conferences". Our results demonstrate that highly ranked events share similar profiles, including the provision of outstanding reviews, visiting diverse locations, having reputed people involved, and renowned sponsors.
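
The abstract of entry 7 names the h-index as an example of an existing metric. For reference, a minimal sketch of its standard definition follows: the largest h such that at least h publications have at least h citations each. The function name `h_index` and the sample citation counts are illustrative and do not come from the paper.

```python
def h_index(citations):
    """Largest h such that at least h items have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` items all have at least `rank` citations
        else:
            break
    return h
```

For example, the citation counts `[3, 0, 6, 1, 5]` give an h-index of 3, because three publications have at least three citations each but there are not four with at least four.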
