1.
Database (Oxford) ; 2022, 2022 Jul 15.
Article in English | MEDLINE | ID: mdl-35849027

ABSTRACT

In this research, we explored various state-of-the-art biomedical pre-trained Bidirectional Encoder Representations from Transformers (BERT) models for the National Library of Medicine - Chemistry (NLM-CHEM) and LitCovid tracks of the BioCreative VII Challenge, and we propose a BERT-based ensemble learning approach that integrates the advantages of the various models to improve system performance. The experimental results on the NLM-CHEM track demonstrate that our method achieves F1-scores of 85% and 91.8% in strict and approximate evaluations, respectively. Moreover, the proposed Medical Subject Headings identifier (MeSH ID) normalization algorithm is effective in entity normalization, achieving an F1-score of about 80% in both strict and approximate evaluations. For the LitCovid track, the proposed method is also effective in detecting topics in the Coronavirus disease 2019 (COVID-19) literature; it outperformed the compared methods and achieved state-of-the-art performance on the LitCovid corpus. Database URL: https://www.ncbi.nlm.nih.gov/research/coronavirus/.
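
The abstract does not say how the ensemble combines its member models. As one hedged illustration, a per-token majority vote over the label sequences predicted by several models could look like the sketch below. The function name `majority_vote` and the BIO-style labels are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token labels from several models by majority vote.

    predictions: list of label sequences, one per model, all of equal length.
    Ties go to the label seen first (an arbitrary choice for this sketch).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same tokens")
    combined = []
    for token_labels in zip(*predictions):
        label, _count = Counter(token_labels).most_common(1)[0]
        combined.append(label)
    return combined


# Three hypothetical models labelling the same four tokens.
model_outputs = [
    ["B-Chemical", "I-Chemical", "O", "O"],
    ["B-Chemical", "O", "O", "O"],
    ["B-Chemical", "I-Chemical", "O", "B-Chemical"],
]
ensemble_labels = majority_vote(model_outputs)
```

Voting is only one way to build an ensemble; averaging model probabilities or stacking a second-level classifier are common alternatives.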


Subjects
COVID-19 , Data Mining , Data Mining/methods , Humans , Machine Learning , Medical Subject Headings , PubMed
2.
Medicine (Baltimore) ; 101(27): e29213, 2022 Jul 08.
Article in English | MEDLINE | ID: mdl-35801759

ABSTRACT

BACKGROUND: The number of bibliographic studies published has increased steadily over the years. This rise is attributed to better accessibility of bibliographic data and of software packages that specialize in bibliographic analyses. Any difference in citation achievements between bibliographic and meta-analysis studies observed so far needs to be verified. In this study, we aimed to identify the MeSH terms frequently observed in these 2 types of study and to investigate whether the highlighted MeSH terms are strongly associated with one of the study types. METHODS: By searching the PubMed Central database, we downloaded 5121 articles relevant to bibliometric and meta-analysis studies published since 2011. Social network analysis was applied to highlight the major MeSH terms for quantitative and statistical methods in these 2 types of studies. MeSH terms were then individually tested for differences in event counts over the years between study types, using odds ratios with 95% confidence intervals for comparison. RESULTS: Across these 2 study types, the most productive countries were the United States (19.9%), followed by the United Kingdom (8.8%) and China (8.7%); the largest numbers of articles were published in PLoS One (2.9%), Stat Med (2.5%), and Res Synth (2.4%); and the most frequently observed MeSH terms were "statistics and numerical data" in bibliographic studies and "methods" in meta-analysis. Differences between the 2 study types were found in both event counts and citation achievements. CONCLUSION: The breakthrough was the development of a dashboard that uses forest plots to display the differences in event counts. The visualization of the observed MeSH terms could be replicated for future academic pursuits and applications in other disciplines using odds ratios with 95% confidence intervals.
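
The abstract compares event counts between study types using odds and 95% confidence intervals but gives no formula. Assuming this refers to a standard odds ratio with a Wald interval on the log scale, a minimal sketch follows. The function name and the counts are invented for illustration and are not the study's data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: group 1 with the term, b: group 1 without,
    c: group 2 with the term, d: group 2 without.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up counts: a MeSH term appears in 20 of 100 bibliometric
# articles and in 10 of 100 meta-analysis articles.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that contains 1, as it does for these made-up counts, would mean the term is not clearly associated with either study type. A forest plot draws one such interval per MeSH term.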


Subjects
Bibliometrics , Meta-Analysis as Topic , Humans , Medical Subject Headings , PubMed , Retrospective Studies , United States
3.
Medicine (Baltimore) ; 101(30): e29396, 2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35905256

ABSTRACT

BACKGROUND: Psoriasis vulgaris is a chronic inflammatory disease characterized by keratinocyte hyperproliferation. Bibliometric analysis can help determine the most influential articles on the topic of "Psoriasis Vulgaris and biological agents (PVBAs)", but which factors affect article citations remains unclear. This study aims (1) to identify the top 100 most cited articles in PVBA (PVBA100 for short) from 1991 to 2020, (2) to visualize dominant entities on one diagram using data in PVBA100, and (3) to investigate whether medical subject headings (MeSH terms) can be used to predict article citations. METHODS: The top 100 most cited articles relevant to PVBA (1991-2020) were downloaded by searching the PubMed database. Citation analysis was applied to compare the dominant roles in article types and topic categories using pyramid plots. Social network analysis (SNA) and Sankey diagrams were applied to highlight prominent entities. We examined the effect of MeSH terms in predicting article citations using correlation coefficients. RESULTS: The most frequent article type and topic category were research support by institutes (46%) and drug therapy (88%), respectively. The most productive countries were the United States (38%), followed by Germany (13%) and Japan (12%). Most articles were published in Br J Dermatol (13%) and J Invest Dermatol (11%). MeSH terms showed predictive power for the number of article citations (correlation coefficient = 0.45, t = 4.99). CONCLUSIONS: The breakthrough was the development of one dashboard to display PVBA100. MeSH terms can be used for predicting article citations in PVBA100. These visualizations of PVBA100 could be applied to future academic pursuits and applications in other academic disciplines.
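
The reported t value can be checked against the reported correlation coefficient with the usual significance test for a Pearson correlation. The sketch below assumes that this test was used and that n = 100 (the size of the PVBA100 set); neither assumption is stated explicitly in the abstract.

```python
import math


def t_statistic_from_r(r, n):
    """t statistic for testing a Pearson correlation r against zero, n pairs."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# r is reported in the abstract; n = 100 is an assumption (the PVBA100 set).
t_value = t_statistic_from_r(0.45, 100)
print(round(t_value, 2))  # prints 4.99
```

Under these assumptions the result agrees with the t = 4.99 given in the abstract.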


Subjects
Biological Factors , Psoriasis , Bibliometrics , Humans , Medical Subject Headings , Psoriasis/drug therapy , Publications , United States
4.
Stud Health Technol Inform ; 295: 37-40, 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35773799

ABSTRACT

Medical Subject Headings (MeSH) is one of the most important vocabularies for information retrieval in medical research. It enables fast and reliable retrieval of research on PubMed/MEDLINE, the world's largest body of medical literature. The original English version of the thesaurus can be accessed via a MeSH Browser developed by the NLM. Recently, a multilingual MeSH Browser was proposed to enable usage across languages. To improve upon the original system, a new user interface (UI) was developed using contemporary web design frameworks in combination with principles from cognitive science. It aims to simplify access for medical professionals and increase overall usability. Evaluating such design improvements continually is necessary to quantify the possible positive impact for online systems in medical research. This study therefore directly compares the resulting system to the NLM Browser, using an established online questionnaire. Results show significant improvements in content and navigation as well as overall user satisfaction, while offering feedback for future improvements. This underlines the benefits of employing contemporary web design in terms of usability and user satisfaction.


Subjects
Medical Subject Headings , Multilingualism , MEDLINE , PubMed
5.
BMC Bioinformatics ; 23(1): 259, 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35768777

ABSTRACT

BACKGROUND: The COVID-19 pandemic has increasingly accelerated the publication pace of scientific literature. How to efficiently curate and index this large amount of biomedical literature under the current crisis is of great importance. Previous literature indexing is mainly performed by human experts using Medical Subject Headings (MeSH), which is labor-intensive and time-consuming. Therefore, to alleviate the expensive time consumption and monetary cost, there is an urgent need for automatic semantic indexing technologies for the emerging COVID-19 domain. RESULTS: In this research, to investigate the semantic indexing problem for COVID-19, we first construct the new COVID-19 Semantic Indexing dataset, which consists of more than 80 thousand biomedical articles. We then propose a novel semantic indexing framework based on the multi-probe attention neural network (MPANN) to address the COVID-19 semantic indexing problem. Specifically, we employ a k-nearest neighbour based MeSH masking approach to generate candidate topic terms for each input article. We encode and feed the selected candidate terms as well as other contextual information as probes into the downstream attention-based neural network. Each semantic probe carries specific aspects of biomedical knowledge and provides informatively discriminative features for the input article. After extracting the semantic features at both term-level and document-level through the attention-based neural network, MPANN adopts a linear multi-view classifier to conduct the final topic prediction for COVID-19 semantic indexing. CONCLUSION: The experimental results suggest that MPANN promises to represent the semantic features of biomedical texts and is effective in predicting semantic topics for COVID-19 related biomedical articles.
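
The k-nearest-neighbour step that generates candidate topic terms can be illustrated with a toy example. The sketch below assumes a plain Jaccard similarity over word sets and an invented three-article corpus. The paper's actual similarity measure and data are not given in the abstract, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_mesh_terms(query_text, indexed_articles, k=2):
    """Collect MeSH terms from the k indexed articles most similar to the query.

    indexed_articles: list of (text, set_of_mesh_terms) pairs.
    """
    query_tokens = set(query_text.lower().split())
    ranked = sorted(
        indexed_articles,
        key=lambda item: jaccard(query_tokens, set(item[0].lower().split())),
        reverse=True,
    )
    candidates = set()
    for _text, mesh_terms in ranked[:k]:
        candidates |= mesh_terms
    return candidates


# Toy corpus; texts and term assignments are invented for illustration.
corpus = [
    ("covid-19 vaccine trial in adults", {"COVID-19", "Vaccines"}),
    ("covid-19 pandemic hospital admissions", {"COVID-19", "Pandemics"}),
    ("neural networks for image segmentation", {"Neural Networks, Computer"}),
]
mask = candidate_mesh_terms("covid-19 vaccine side effects", corpus, k=2)
```

Restricting the label space to terms carried by similar, already indexed articles keeps the downstream classifier from scoring every heading in the thesaurus.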


Subjects
COVID-19 , Semantics , Humans , Medical Subject Headings , Neural Networks, Computer , Pandemics
6.
Methods Mol Biol ; 2496: 283-299, 2022.
Article in English | MEDLINE | ID: mdl-35713870

ABSTRACT

Text mining is an important research area for understanding disease associations and gaining insight into disease comorbidities. The reason for comorbid occurrence in a patient may be genetic, or it may be molecular interference from other processes. Comorbidity and multimorbidity are technically different, yet they are inseparable in studies. Their associations overlap in nature and hence can be integrated for a more rational approach. The association rules generally used to determine comorbidity may also help in predicting novel knowledge or may even serve as an important assessment tool in surgical cases. Another approach of interest is to apply biological vocabulary resources such as UMLS/MeSH to patient health information and analyze the interrelationships between different health conditions. The protocol presented here can be used to understand disease associations and to analyze them at an extensive level.


Subjects
Abstracting and Indexing , Medical Subject Headings , Data Mining , Humans , Natural Language Processing , PubMed
7.
BMC Med Res Methodol ; 22(1): 141, 2022 05 14.
Article in English | MEDLINE | ID: mdl-35568796

ABSTRACT

BACKGROUND: Screening for eligible patients continues to pose a great challenge for many clinical trials. This has led to a rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) represent a common entity of EC that is readily available and retrievable from EHR systems, there is a lack of interoperable data models for this entity of EC. A public, specialized data model that uses international, widely adopted terminology for LP, e.g. Logical Observation Identifiers Names and Codes (LOINC®), is much needed to support automated screening tools. OBJECTIVE: The aim of this study is to establish a core dataset for the LP most frequently requested to recruit patients for clinical trials using LOINC terminology. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment. METHODS: We used a semi-automated approach to analyze 10,516 screening forms from the Medical Data Models (MDM) portal's data repository that are pre-annotated with the Unified Medical Language System (UMLS). An automated semantic analysis based on concept frequency was followed by an extensive manual expert review performed by physicians to analyze complex recruitment-relevant concepts not amenable to an automatic approach. RESULTS: Based on analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures represented 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types and covered the vast majority of Medical Subject Headings (MeSH) disease domains. CONCLUSIONS: Only a small set of common LP covers the majority of laboratory concepts in screening EC forms, which supports the feasibility of establishing a focused core dataset for LP. We present ELaPro, a novel, LOINC-mapped core dataset for the 55 most frequent LP requested in screening for clinical trials. ELaPro is available in multiple machine-readable data formats such as CSV, ODM and HL7 FHIR. The extensive manual curation of this large number of free-text EC, as well as the combination of UMLS and LOINC terminologies, distinguishes this specialized dataset from previous relevant datasets in the literature.
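
The finding that a small set of procedures covers most concept occurrences is a top-k coverage calculation over concept frequencies. A minimal sketch with invented counts follows; the function name is hypothetical, and the real analysis also involved manual expert review that code like this does not capture.

```python
from collections import Counter


def top_k_coverage(concept_occurrences, k):
    """Fraction of all occurrences accounted for by the k most frequent concepts."""
    counts = Counter(concept_occurrences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = counts.most_common(k)
    return sum(count for _concept, count in top) / total


# Invented occurrences of laboratory concepts in screening forms.
occurrences = ["hemoglobin"] * 5 + ["creatinine"] * 3 + ["ALT"] + ["TSH"]
share = top_k_coverage(occurrences, k=2)
```

Here the two most frequent concepts account for 8 of 10 occurrences, the same kind of ratio the abstract reports as 77.87% for 55 procedures.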


Subjects
Logical Observation Identifiers Names and Codes , Medical Subject Headings , Humans , Semantics
8.
Stud Health Technol Inform ; 294: 357-361, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612096

ABSTRACT

The distributed nature of our digital healthcare and the rapid emergence of new data sources prevent a compelling overview and the joint use of new data. Data integration, e.g., with metadata and semantic annotations, is expected to overcome this challenge. In this paper, we present an approach to predict UMLS codes for given German metadata using recurrent neural networks. Augmenting the training dataset with the Medical Subject Headings (MeSH), particularly the German translations, also improved model accuracy. The model demonstrates robust performance with 75% accuracy and aims to show that increasingly sophisticated machine learning tools can already play a significant role in data integration.


Subjects
Metadata , Semantics , Information Storage and Retrieval , Medical Subject Headings , Neural Networks, Computer , Unified Medical Language System
9.
Stud Health Technol Inform ; 294: 403-404, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612105

ABSTRACT

OBJECTIVE: The aim of this paper is to propose an extended translation of the MeSH thesaurus based on Wikipedia pages. METHODS: A mapping was established between each MeSH descriptor (preferred terms and synonyms) and the corresponding Wikipedia pages. RESULTS: A tool called "WikiMeSH" has been developed. Among the top 20 languages of this study, seven currently have no MeSH translation: Arabic, Catalan, Farsi (Iran), Mandarin Chinese, Korean, Serbian, and Ukrainian. For these seven languages, WikiMeSH proposes translations ranging from 47% (Arabic) to 34% (Serbian). CONCLUSION: WikiMeSH is an interesting tool for translating the MeSH thesaurus and other health terminologies and ontologies based on a mapping to Wikipedia pages.


Subjects
Medical Subject Headings , Translating , Language , Translations , Vocabulary, Controlled
10.
Stud Health Technol Inform ; 294: 876-877, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612233

ABSTRACT

We present an analysis of the supplementary materials of PubMed Central (PMC) articles and show their importance in indexing and searching biomedical literature, in particular for the emerging field of genomic medicine. On a subset of articles from PubMed Central, we use text mining methods to extract MeSH terms from abstracts, full texts, and text-based supplementary materials. We find that the recall of MeSH annotations increases by about 5.9 percentage points (+20% in relative terms) when supplementary materials are considered, compared to using only abstracts. We further compare the supplementary material annotations with full-text annotations and find that the recall of MeSH terms increases by 1.5 percentage points (+3% in relative terms). Additionally, we analyze genetic variant mentions in abstracts and full texts and compare them with mentions found in text-based supplementary files. We find that the majority (about 99%) of variants are found in text-based supplementary files. In conclusion, we suggest that supplementary data should receive more attention from the information retrieval community, in particular in the life and health sciences.
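
The abstract reports each recall gain twice, once in percentage points and once as a relative change. The sketch below shows how the two figures relate. The baseline recall of 29.5% is back-calculated from the reported pair (+5.9 points, +20% relative) purely for illustration and is not a value stated in the abstract.

```python
def recall_gain(baseline, improved):
    """Return (gain in percentage points, relative gain in percent)."""
    if baseline <= 0:
        raise ValueError("baseline recall must be positive")
    points = (improved - baseline) * 100
    relative = (improved - baseline) / baseline * 100
    return points, relative


# Illustrative baseline only: 29.5% is back-calculated from the reported
# +5.9 points and +20% relative, not a figure stated in the abstract.
points, relative = recall_gain(0.295, 0.354)
```

By the same arithmetic, a gain of 1.5 points that corresponds to +3% relative implies a baseline near 50%, which fits full texts starting from a higher recall than abstracts.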


Subjects
Medical Subject Headings , Text Messaging , Data Mining/methods , PubMed , Records
11.
Open educational resource in Spanish | CVSP - Regional | ID: oer-4052

ABSTRACT

Primary and Secondary Descriptors


Subjects
Subject Headings , Medical Subject Headings , LILACS
12.
J Biomed Inform ; 128: 104047, 2022 04.
Article in English | MEDLINE | ID: mdl-35257868

ABSTRACT

The co-occurrence analysis of Medical Subject Heading (MeSH) terms extracted from the PubMed database is widely used in bibliometrics. In practice, to make the results interpretable, it is necessary to filter the co-occurrence matrix to remove low-frequency items because of their low representativeness. Unfortunately, little research has addressed how to determine a critical threshold for removing noise from the co-occurrence matrix. Here, we propose a probabilistic model for co-occurrence analysis that provides statistical inference about whether paired items co-occur randomly. With the help of this model, the dimensionality of the co-occurrence matrix can be reduced according to the selected threshold. The conceptual model framework, simulations and practical applications are illustrated in the manuscript. Further details (including all reproducible code) can be downloaded from the project website: https://github.com/xizhou/co-occurrence-analysis.git.
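
The abstract does not specify the probabilistic model. One common way to test whether two terms co-occur more often than chance is a hypergeometric tail probability, sketched below as an assumption. It is not necessarily the authors' model; the linked repository holds their actual code. The function name and counts are invented.

```python
from math import comb


def cooccurrence_p_value(n_docs, n_a, n_b, n_both):
    """P(X >= n_both) when terms A and B are assigned independently.

    X follows a hypergeometric distribution: n_b documents are drawn from
    n_docs, of which n_a carry term A.
    """
    upper = min(n_a, n_b)
    total = comb(n_docs, n_b)
    tail = sum(
        comb(n_a, x) * comb(n_docs - n_a, n_b - x)
        for x in range(n_both, upper + 1)
    )
    return tail / total


# Invented counts: 10 documents, each term on 5 of them, always together.
p = cooccurrence_p_value(10, 5, 5, 5)
```

Term pairs whose p-value exceeds a chosen threshold could then be dropped from the co-occurrence matrix, which is the kind of dimensionality reduction the abstract describes.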


Subjects
Bibliometrics , Medical Subject Headings , Cluster Analysis , Models, Statistical , PubMed
14.
BMC Med Res Methodol ; 22(1): 79, 2022 03 25.
Article in English | MEDLINE | ID: mdl-35337283

ABSTRACT

BACKGROUND: Deprescribing literature has been increasing rapidly. Our aim was to develop and validate search filters to identify articles on deprescribing in Medline via PubMed and in Embase via Embase.com. METHODS: Articles published from 2011 to 2020 in a core set of eight journals (covering fields of interest for deprescribing, such as geriatrics, pharmacology and primary care) formed a reference set. Each article was screened independently in duplicate and classified as relevant or non-relevant to deprescribing. Relevant terms were identified by term frequency analysis in a 70% subset of the reference set. Selected title and abstract terms, MeSH terms and Emtree terms were combined to develop two highly sensitive filters for Medline via PubMed and Embase via Embase.com. The filters were validated against the remaining 30% of the reference set. Sensitivity, specificity and precision were calculated with their 95% confidence intervals (95% CI). RESULTS: A total of 23,741 articles were aggregated in the reference set, and 224 were classified as relevant to deprescribing. A total of 34 terms and 4 MeSH terms were identified to develop the Medline search filter. A total of 27 terms and 6 Emtree terms were identified to develop the Embase search filter. The sensitivity was 92% (95% CI: 83-97%) in Medline via PubMed and 91% (95% CI: 82-96%) in Embase via Embase.com. CONCLUSIONS: These are the first deprescribing search filters that have been developed objectively and validated. These filters can be used in search strategies for future deprescribing reviews. Further prospective studies are needed to assess their effectiveness and efficiency when used in systematic reviews.
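
Validating a search filter against a reference set comes down to a confusion matrix plus an interval estimate for each proportion. The sketch below uses the Wilson score interval, which is an assumption, since the abstract does not name the interval method. The counts are invented rather than taken from the study.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


def filter_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and precision of a search filter."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Invented validation counts, not the study's data.
metrics = filter_metrics(tp=46, fp=400, fn=4, tn=6550)
sens_low, sens_high = wilson_interval(46, 50)
```

Because relevant articles are rare in the reference set, a filter can combine high sensitivity with low precision, and the width of the sensitivity interval is driven by the small number of relevant articles in the validation split.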


Subjects
Deprescriptions , Humans , MEDLINE , Medical Subject Headings , PubMed , Systematic Reviews as Topic
15.
BMC Bioinformatics ; 23(1): 56, 2022 Feb 01.
Article in English | MEDLINE | ID: mdl-35105306

ABSTRACT

BACKGROUND: Besides Boolean retrieval with medical subject headings (MeSH), PubMed provides users with an alternative, called "Related Articles", for accessing and collecting relevant documents based on semantic similarity. To explore this functionality more efficiently and more accurately, we proposed an improved algorithm that measures the semantic similarity of PubMed citations based on a MeSH-concept network model. RESULTS: Three article similarity networks were obtained using MeSH-concept random walk with restart (MCRWR), MeSH random walk with restart (MRWR) and PubMed related article (PMRA), respectively. The areas under the receiver operating characteristic (ROC) curve of MCRWR, MRWR and PMRA are 0.93, 0.90, and 0.67, respectively. The precisions of MCRWR and MRWR under various similarity thresholds are higher than that of PMRA. The mean P5 of MCRWR is 0.742, which is much higher than those of MRWR (0.692) and PMRA (0.223). In the article semantic similarity network of "Genes & Function of organ & Disease" based on the MCRWR algorithm, four topics are identified according to gold standards. CONCLUSION: The MeSH-concept random walk with restart algorithm performs better in constructing an article semantic similarity network, which can reveal the implicit semantic associations between documents. The efficiency and accuracy of retrieving semantically related documents have been improved considerably.
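
Random walk with restart, the building block named in the abstract, can be sketched on a toy graph. This is the generic algorithm in pure Python, under the assumption of an undirected, unweighted graph. It is not the MCRWR variant itself, whose network construction is not described in the abstract.

```python
def random_walk_with_restart(adjacency, seed, restart=0.3, iterations=100):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    adjacency: symmetric 0/1 matrix as a list of lists; every node needs
    at least one neighbour.
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    if any(d == 0 for d in degree):
        raise ValueError("isolated nodes are not supported in this sketch")
    scores = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Probability mass arriving at i from each neighbour j.
            inflow = sum(adjacency[j][i] * scores[j] / degree[j] for j in range(n))
            restart_mass = restart if i == seed else 0.0
            updated.append((1 - restart) * inflow + restart_mass)
        scores = updated
    return scores


# Toy path graph 0 - 1 - 2 (for example, three linked MeSH nodes).
graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
proximity = random_walk_with_restart(graph, seed=0)
```

The resulting scores rank nodes by proximity to the seed. Comparing such score vectors for two articles is one way to turn a term network into an article similarity.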


Subjects
Medical Subject Headings , Semantic Web , Algorithms , PubMed , Semantics
16.
J Med Libr Assoc ; 110(1): 23-33, 2022 Jan 01.
Article in English | MEDLINE | ID: mdl-35210959

ABSTRACT

OBJECTIVE: This study compared the recall and precision of MeSH-term versus text-word searching to better understand psychosocial MeSH terms and to provide guidance on whether to include both strategies in an information literacy session or how much time should be spent on teaching each search strategy. METHODS: Using the relevant recall method, a total of 3,162 resources were considered and evaluated to form a gold standard set of 1,521 relevant resources. We compared resources discussing psychosocial aspects of children and adolescents living with type 1 diabetes using two search strategies: text-word strategy versus MeSH-term strategy. The frequency of MeSH terms, the MeSH hierarchy, and elements of each search strategy were also examined. RESULTS: Using the 1,521 relevant articles, we found that the text-word search strategy had 54% recall, while the MeSH-term strategy had 75% recall. Also, the precision of the text-word strategy was 34.4%, while the precision of the MeSH-term strategy was 47.7%. Therefore, the MeSH-term search strategy yielded both greater recall and greater precision. The MeSH strategy was also more complicated in design and usage than the text-word strategy. CONCLUSIONS: This study demonstrates the effectiveness of text-word and MeSH search strategies on precision and recall. The combination of text-word and MeSH strategies is recommended to achieve the most comprehensive results. These results support the idea that MeSH or a similar controlled vocabulary should be taught to experienced and knowledgeable students and practitioners who require a myriad of resources for their literature searches.
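
The two strategies can be compared on a single number by combining the reported precision and recall into an F1 score. The study itself does not report F1; the sketch below is only a derived summary of the figures in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1_text_word = f1_score(0.344, 0.54)
f1_mesh = f1_score(0.477, 0.75)
```

This gives roughly 0.42 for the text-word strategy and roughly 0.58 for the MeSH-term strategy, consistent with the abstract's conclusion that the MeSH strategy did better on both measures.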


Subjects
Libraries , Medical Subject Headings , Adolescent , Child , Humans , Information Storage and Retrieval , MEDLINE , Vocabulary, Controlled
17.
PLoS One ; 17(2): e0263001, 2022.
Article in English | MEDLINE | ID: mdl-35139089

ABSTRACT

The COVID-19 outbreak has posed an unprecedented challenge to humanity and science. On the one hand, public and private incentives have been put in place to promptly allocate resources toward research areas strictly related to the COVID-19 emergency. On the other hand, research in many fields not directly related to the pandemic has been displaced. In this paper, we assess the impact of COVID-19 on world scientific production in the life sciences and find indications that the usage of medical subject headings (MeSH) has changed following the outbreak. Using a difference-in-differences approach, we estimate the impact of the start of the COVID-19 pandemic on scientific production in the PubMed database (3.6 million research papers). We find that COVID-19-related MeSH terms have experienced a 6.5-fold increase in output on average, while publications on unrelated MeSH terms dropped by 10% to 12%. The publication-weighted impact shows an even more pronounced negative effect (-16% to -19%). Moreover, COVID-19 has displaced clinical trial publications (-24%) and diverted grants from research areas not closely related to COVID-19. Note that since COVID-19 publications may have been fast-tracked, the sudden surge in COVID-19 publications might be driven by editorial policy.
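
In its simplest two-group, two-period form, a difference-in-differences estimate is the change in the treated group minus the change in the control group. The sketch below shows only that textbook form with invented counts. The paper's actual specification (likely a regression with further controls) is not given in the abstract.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Invented yearly publication counts for two groups of MeSH terms.
effect = difference_in_differences(
    treated_pre=100, treated_post=650,
    control_pre=100, control_post=110,
)
```

Subtracting the control group's change removes trends shared by both groups, so the remainder is attributed to the event under the assumption that the groups would otherwise have moved in parallel.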


Subjects
Biomedical Research , COVID-19 , Bibliometrics , Biological Science Disciplines , COVID-19/epidemiology , Humans , Medical Subject Headings , PubMed
18.
Semina cienc. biol. saude ; 43(1): 129-152, Jan./Jun. 2022. illus., tab.
Article in English | LILACS | ID: biblio-1354470

ABSTRACT

This macro-level scientometric study aimed to analyze the similarities and differences in the scientific communication patterns of the Brazilian postgraduate programs (BPPs) belonging to the Biological Sciences II field (BS2), as defined by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). The most researched diseases were also identified, and their relationship with the needs of Brazilian public health was discussed, considering the burden of disease (Disability-Adjusted Life Year - DALY, Brazil) estimated by the World Health Organization (WHO). Thus, the scientific production of the BS2 sub-areas Biophysics, Biochemistry, Pharmacology, Physiology, and Morphology was evaluated from 2013 to 2016, considering citation impact, Impact Factor (Journal Citation Reports), and scientific collaboration. Data collected included formal information provided to CAPES by all BPPs through the Plataforma Sucupira as well as metadata from Web of Science documents. In addition, the standardized Medical Subject Headings (PubMed) were employed for the analysis of researched diseases. We concluded that the patterns of scientific communication in Biophysics, Biochemistry, Pharmacology, Physiology, and Morphology were predominantly different. Thus, there is a need to consider the specificities of the five sub-areas in the evaluation process performed by CAPES. Different approaches are revealed by identifying the most frequently researched diseases and explaining the contributions of each sub-area to Brazilian public health.


Subjects
Humans , World Health Organization , Biological Science Disciplines , Medical Subject Headings , Journal Impact Factor , Metadata , Powders , Biochemistry , Biophysics , Public Health , PubMed
19.
BMC Bioinformatics ; 23(1): 23, 2022 Jan 06.
Article in English | MEDLINE | ID: mdl-34991460

ABSTRACT

BACKGROUND: Ontology-based semantic similarity measures based on SNOMED-CT, MeSH, and Gene Ontology are being extensively used in many applications in biomedical text mining and genomics, respectively, which has encouraged the development of semantic measures libraries based on the aforementioned ontologies. However, current state-of-the-art semantic measures libraries have some performance and scalability drawbacks derived from their ontology representations, which are based on relational databases or naive in-memory graph representations. Likewise, a recent reproducible survey on word similarity shows that one hybrid IC-based measure which integrates a shortest-path computation sets the state of the art in the family of ontology-based semantic measures. However, the lack of an efficient shortest-path algorithm for real-time computation prevents both the practical use of this measure in any application and the use of any other path-based semantic similarity measure. RESULTS: To bridge the two aforementioned gaps, this work introduces for the first time an updated version of the HESML Java software library especially designed for the biomedical domain, which implements the most efficient and scalable ontology representation reported in the literature, together with a new method for approximating Dijkstra's algorithm on taxonomies, called Ancestors-based Shortest-Path Length (AncSPL), which allows the real-time computation of any path-based semantic similarity measure. CONCLUSIONS: We introduce a set of reproducible benchmarks showing that HESML outperforms the current state-of-the-art libraries by several orders of magnitude on the three aforementioned biomedical ontologies, and demonstrating the real-time performance and approximation quality of the new AncSPL shortest-path algorithm. Likewise, we show that AncSPL scales linearly with the dimension of the common ancestor subgraph regardless of the ontology size. Path-based measures based on the new AncSPL algorithm are up to six orders of magnitude faster than their exact implementation in large ontologies like SNOMED-CT and GO. Finally, we provide a detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results.
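
The general idea of restricting a shortest-path search to the ancestors of two concepts can be illustrated with a simplified sketch. The code below is not the AncSPL algorithm, whose details are not in the abstract. It only considers paths that climb from each concept to a shared ancestor, which is an approximation in taxonomies with multiple inheritance. The function names and the tiny taxonomy are invented.

```python
from collections import deque


def ancestor_distances(node, parents):
    """Fewest is-a edges from `node` up to each of its ancestors (and itself)."""
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in distances:
                distances[parent] = distances[current] + 1
                queue.append(parent)
    return distances


def path_length_via_ancestors(a, b, parents):
    """Shortest a-b path that climbs to a shared ancestor, or None if none exists."""
    up_a = ancestor_distances(a, parents)
    up_b = ancestor_distances(b, parents)
    shared = up_a.keys() & up_b.keys()
    if not shared:
        return None
    return min(up_a[c] + up_b[c] for c in shared)


# Tiny invented taxonomy: child -> list of parents.
taxonomy = {
    "psoriasis": ["skin disease"],
    "eczema": ["skin disease"],
    "skin disease": ["disease"],
    "influenza": ["disease"],
}
```

For example, `path_length_via_ancestors("psoriasis", "eczema", taxonomy)` explores only the handful of nodes above the two concepts, so the work depends on the size of their ancestor sets rather than on the size of the whole ontology.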


Subjects
Biological Ontologies , Semantics , Medical Subject Headings , Reproducibility of Results , Systematized Nomenclature of Medicine
20.
Stud Health Technol Inform ; 289: 384-387, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35062172

ABSTRACT

The National Library of Medicine (NLM) controls and publishes the Medical Subject Headings thesaurus, which is used for indexing PubMed. Besides an XML export, the NLM offers a web-based MeSH browser. The platform contains English terms. The German Institute for Medical Documentation and Information (DIMDI) partially translated and published these terms. Recently, the German National Library of Medicine (ZB-MED) took over the translation of MeSH. However, there is no dedicated platform that focuses on MeSH and covers multiple languages. Here, we address this gap by offering a modern, multilingual, searchable MeSH browser. A modular platform using open source technology is presented. The frontend enables the user to search and browse terms and to switch between different languages. The current version of the presented MeSH browser contains English and German MeSH terms and can be accessed at https://mesh-browser.de.


Subjects
Medical Subject Headings , Vocabulary, Controlled , MEDLINE , National Library of Medicine (U.S.) , PubMed , United States