1.
Bioinformatics ; 37(8): 1156-1163, 2021 05 23.
Article in English | MEDLINE | ID: mdl-33107905

ABSTRACT

MOTIVATION: Structured semantic resources, such as biological knowledge bases and ontologies, formally define biological concepts, entities and their semantic relationships, expressed as structured axioms and unstructured texts (e.g. textual definitions). These resources contain accurate expressions of biological reality and have been used by machine-learning models to support intelligent applications such as knowledge discovery. Current methods treat both the axioms and the definitions as plain text in representation learning (RL). However, because the axioms are machine-readable while natural language is written for human readers, differences in token meaning and structure prevent the representations from encoding the desired biological knowledge. RESULTS: We propose ERBK, an RL model of bio-entities. Instead of using the axioms and definitions as a textual corpus, our method uses a knowledge graph embedding method and deep convolutional neural models to encode the axioms and definitions, respectively. The resulting representations not only encode more of the underlying biological knowledge but can also be applied in zero-shot settings, where existing approaches fall short. Experimental evaluations show that ERBK outperforms existing methods for predicting protein-protein interactions and gene-disease associations. Moreover, ERBK maintains promising performance in the zero-shot setting. We believe the representations and the method are general enough to extend to other types of bio-relation. AVAILABILITY AND IMPLEMENTATION: The source code is available at the gitlab repository https://gitlab.com/BioAI/erbk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Knowledge Bases , Machine Learning , Humans , Language , Semantics , Software
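
As a rough illustration of what encoding axioms with a knowledge graph embedding can look like, the sketch below scores a (head, relation, tail) triple with a TransE-style distance. TransE is used here only as a familiar example; the abstract does not say which embedding method ERBK uses, and the vectors and entity names below are hypothetical, hand-picked values rather than learned embeddings.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative Euclidean distance between
    head + relation and tail. Higher (closer to zero) is more plausible."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


# Hypothetical 2-dimensional embeddings, chosen by hand for illustration.
protein_a = [0.25, 0.5]
interacts_with = [0.5, 0.25]
protein_b = [0.75, 0.75]   # equals protein_a + interacts_with
protein_c = [-1.0, 2.0]    # an unrelated entity

true_score = transe_score(protein_a, interacts_with, protein_b)
corrupted_score = transe_score(protein_a, interacts_with, protein_c)
print(true_score > corrupted_score)
```

In a trained model the vectors would be learned so that triples stated in the axioms score higher than corrupted ones; here the comparison holds only because the values were picked by hand.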
2.
Bioinformatics ; 36(2): 611-620, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31350561

ABSTRACT

MOTIVATION: A biochemical reaction, or bio-event, depicts the relationships between participating entities. Current text mining research has focused on identifying bio-events in the scientific literature. However, little effort has been devoted to normalizing bio-events extracted from the literature to the entries in curated reaction databases, which could disambiguate the events and further support connecting events into biologically meaningful and complete networks. RESULTS: In this paper, we propose BioNorm, a novel method for normalizing bio-events extracted from scientific literature to entries in a bio-molecular reaction database, e.g. IntAct. BioNorm treats event normalization as a paraphrase identification problem. It represents an entry as a natural language statement by combining the multiple types of information the entry contains. It then predicts the semantic similarity between this statement and the statements mentioning events in the scientific literature using a long short-term memory recurrent neural network (LSTM). An event is normalized to the entry if the two statements are paraphrases. To the best of our knowledge, this is the first attempt at event normalization in biomedical text mining. The experiments were conducted using molecular interaction data from IntAct. The results demonstrate that the method achieves an F-score of 0.87 in normalizing event-containing statements. AVAILABILITY AND IMPLEMENTATION: The source code is available at the gitlab repository https://gitlab.com/BioAI/leen and BioASQvec Plus is available on figshare https://figshare.com/s/45896c31d10c3f6d857a.


Subjects
Data Mining , Deep Learning , Genetic Databases , Computer Neural Networks , Software
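
The normalization-as-paraphrase idea in this abstract can be sketched in a few lines: render each database entry as a statement, score it against a statement from the literature, and accept the best match above a threshold. This is only a toy under stated assumptions. Token-level Jaccard overlap stands in for the LSTM similarity model, and the entry fields, identifiers, entity names and threshold are all hypothetical, not taken from BioNorm or IntAct.

```python
def entry_to_statement(entry):
    """Render a (hypothetical) database entry as a plain-language statement."""
    return (
        f"{entry['participant_a']} {entry['interaction_type']} "
        f"{entry['participant_b']}"
    )


def jaccard_similarity(text_a, text_b):
    """Token-set overlap; a crude stand-in for a learned similarity model."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def normalize(event_statement, entries, threshold=0.3):
    """Return the best-matching entry, or None if nothing clears the threshold."""
    best_entry, best_score = None, 0.0
    for entry in entries:
        score = jaccard_similarity(event_statement, entry_to_statement(entry))
        if score > best_score:
            best_entry, best_score = entry, score
    return best_entry if best_score >= threshold else None


# Hypothetical entries with placeholder entity names.
entries = [
    {"id": "entry-1", "participant_a": "protA",
     "interaction_type": "binds", "participant_b": "protB"},
    {"id": "entry-2", "participant_a": "protC",
     "interaction_type": "phosphorylates", "participant_b": "protD"},
]

matched = normalize("we found that protA binds protB in vitro", entries)
unmatched = normalize("protE inhibits protF", entries)
```

A learned model is needed in practice because real paraphrases often share few surface tokens, which is exactly where an overlap measure like this one breaks down.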
3.
Sci Data ; 9(1): 387, 2022 07 08.
Article in English | MEDLINE | ID: mdl-35803960

ABSTRACT

The study of histopathological phenotypes is vital for cancer research and medicine because it links molecular mechanisms to disease prognosis. It typically involves integrating heterogeneous histopathological features in whole-slide images (WSIs) to objectively characterize a histopathological phenotype. However, large-scale phenotype characterization has been hindered by the fragmentation of histopathological features, which results from the lack of a standardized format and a controlled vocabulary for the structured and unambiguous representation of semantics in WSIs. To fill this gap, we propose the Histopathology Markup Language (HistoML), a representation language with an accompanying controlled vocabulary (Histopathology Ontology), both based on Semantic Web technologies. Multiscale features within a WSI, from single-cell features to mesoscopic features, can be represented using HistoML, which is a crucial step towards making WSIs findable, accessible, interoperable and reusable (FAIR). We pilot HistoML by representing WSIs of kidney cancer and thyroid carcinoma, and we illustrate the use of HistoML representations in semantic queries to demonstrate the potential of HistoML-powered applications for phenotype characterization.


Subjects
Diagnostic Imaging , Terminology as Topic , Humans , Kidney Neoplasms/diagnostic imaging , Kidney Neoplasms/pathology , Semantic Web , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/pathology , Controlled Vocabulary
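
To make the notion of a semantic query over a Semantic Web representation concrete, the sketch below stores slide annotations as subject-predicate-object triples and answers a two-step pattern query in plain Python. Every identifier in it (the `ex:` prefix, `ex:hasRegion`, `ex:TumorRegion`, `ex:nuclearArea` and the values) is a hypothetical placeholder, not a term from HistoML or the Histopathology Ontology, and a real system would more likely use an RDF store queried with SPARQL.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Hypothetical annotations for one whole-slide image.
triples = [
    ("ex:slide1", "ex:hasRegion", "ex:region1"),
    ("ex:slide1", "ex:hasRegion", "ex:region2"),
    ("ex:region1", "rdf:type", "ex:TumorRegion"),
    ("ex:region2", "rdf:type", "ex:StromaRegion"),
    ("ex:region1", "ex:containsCell", "ex:cell1"),
    ("ex:cell1", "ex:nuclearArea", 42.0),
]

# Query: which regions of slide1 are typed as tumor regions?
regions = [o for (_, _, o) in match(triples, "ex:slide1", "ex:hasRegion")]
tumor_regions = [
    r for r in regions if match(triples, r, "rdf:type", "ex:TumorRegion")
]
print(tumor_regions)
```

Chaining patterns this way is how a query can move between scales, for example from a slide to its regions and then to the cells each region contains.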