Results 1 - 20 of 151
1.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38917409

ABSTRACT

MOTIVATION: Biomedical relation extraction at the document level (Bio-DocRE) involves extracting relation instances from biomedical texts that span multiple sentences, often containing various entity concepts such as genes, diseases, chemicals, variants, etc. Currently, this task is usually implemented based on graphs or transformers. However, most work directly models entity features to relation prediction, ignoring the effectiveness of entity pair information as an intermediate state for relation prediction. In this article, we decouple this task into a three-stage process to capture sufficient information for improving relation prediction. RESULTS: We propose an innovative framework HTGRS for Bio-DocRE, which constructs a hierarchical tree graph (HTG) to integrate key information sources in the document, achieving relation reasoning based on entity. In addition, inspired by the idea of semantic segmentation, we conceptualize the task as a table-filling problem and develop a relation segmentation (RS) module to enhance relation reasoning based on the entity pair. Extensive experiments on three datasets show that the proposed framework outperforms the state-of-the-art methods and achieves superior performance. AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/passengeryjy/HTGRS.


Subject(s)
Algorithms , Data Mining , Data Mining/methods , Semantics , Computational Biology/methods , Natural Language Processing , Humans
2.
Nano Lett ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38511842

ABSTRACT

Methane oxidation using molecular oxygen remains a grand challenge in which the obstacle is not only the activation of methane but also the reaction with oxygen, considering the mismatch of the ground spin states. Herein, we report TiO2-supported Pt nanocrystals (Pt/TiO2) with surface Pt-Ti alloyed layers that directly convert methane into oxygenates by using O2 as the oxidant with the assistance of CO. The oxygenate yield reached 749.8 mmol gPt⁻¹ in aqueous solution over 0.1% Pt/TiO2 under 31 bar of mixed gas (20:5:6 CH4:CO:O2) at 150 °C for 3 h, while the CH3OH selectivity was 62.3%. On the basis of the control experiments and spectroscopic results, we identified the surface Pt-Ti alloy as the active sites. Moreover, CO promoted the dissociation of O2 on the surface of Pt-Ti alloyed layers and the subsequent activation of CH4 to form oxygenated products.

3.
Small ; 20(15): e2308278, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38009756

ABSTRACT

Designing cost-effective electrocatalysts for the oxygen evolution reaction (OER) holds significant importance in the progression of clean energy generation and efficient energy storage technologies, such as water splitting and rechargeable metal-air batteries. In this work, an OER electrocatalyst is developed using Ni and Fe precursors in combination with different proportions of graphene oxide. The catalyst synthesis involved a rapid reduction process, facilitated by adding sodium borohydride, which successfully formed NiFe nanoparticle nests on graphene support (NiFe NNG). The incorporation of graphene support enhances the catalytic activity, electron transferability, and electrical conductivity of the NiFe-based catalyst. The NiFe NNG catalyst exhibits outstanding performance, characterized by a low overpotential of 292.3 mV and a Tafel slope of 48 mV dec⁻¹, achieved at a current density of 10 mA cm⁻². Moreover, the catalyst exhibits remarkable stability over extended durations. The OER performance of NiFe NNG is on par with that of commercial IrO2 in alkaline media. Such superb OER catalytic performance can be attributed to the synergistic effect between the NiFe nanoparticle nests and graphene, which arises from their large surface area and outstanding intrinsic catalytic activity. The excellent electrochemical properties of NiFe NNG hold great promise for further applications in energy storage and conversion devices.
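
To make the reported Tafel slope concrete, the sketch below applies the standard Tafel relation (overpotential changes by one slope unit per tenfold change in current density) to the two figures quoted in the abstract. It is only an illustration of what a slope of 48 mV per decade means: the function name is hypothetical, and the extrapolation assumes the Tafel line holds beyond the measured point, which the abstract does not claim.

```python
import math


def tafel_overpotential(j, j_ref, eta_ref, slope):
    """Overpotential (mV) at current density j, extrapolated along a Tafel line.

    j, j_ref : current densities in the same units (e.g. mA cm^-2)
    eta_ref  : overpotential in mV measured at j_ref
    slope    : Tafel slope in mV per decade of current density
    """
    if j <= 0 or j_ref <= 0:
        raise ValueError("current densities must be positive")
    return eta_ref + slope * math.log10(j / j_ref)


# Values from the abstract: 292.3 mV at 10 mA cm^-2, slope 48 mV per decade.
# A tenfold higher current density would cost one slope unit more overpotential,
# if (and only if) the linear Tafel regime still applies there.
eta_at_100 = tafel_overpotential(100.0, 10.0, 292.3, 48.0)
```

A smaller slope means the current rises faster for each added millivolt, which is why a low Tafel slope is read as a sign of favorable kinetics.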

4.
Small ; : e2401798, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38700074

ABSTRACT

Covalent organic frameworks (COFs) possessing high crystallinity and the capability to capture low-concentration CO2 (400 ppm) from air are still underdeveloped. The challenge lies in simultaneously incorporating high-density active sites for CO2 insertion and maintaining the ordered structure. Herein, a structure engineering approach is developed to afford an ionic pair-functionalized crystalline and stable fluorinated COF (F-COF) skeleton. The ordered structure of the F-COF is well maintained after the integration of abundant basic fluorinated alcoholate anions, as revealed by synchrotron X-ray scattering experiments. The breakthrough test demonstrates its attractive performance in capturing (400 ppm) CO2 from gas mixtures via O─C bond formation, as indicated by the in situ spectroscopy and operando nuclear magnetic resonance spectroscopy using ¹³C-labeled CO2 sources. Both theoretical and experimental thermodynamic studies reveal a reaction enthalpy of ≈ −40 kJ mol⁻¹ between CO2 and the COF scaffolds. This implies weaker interaction strength compared with state-of-the-art amine-derived sorbents, thus allowing complete CO2 release with less energy input. The structure evolution study from synchrotron X-ray scattering and small-angle neutron scattering confirms the well-maintained crystalline patterns after CO2 insertion. The as-developed proof-of-concept approach provides guidance on anchoring binding sites for direct air capture (DAC) of CO2 in crystalline scaffolds.

5.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37549065

ABSTRACT

MOTIVATION: Few-shot learning that can effectively perform named entity recognition in low-resource scenarios has raised growing attention, but it has not been widely studied yet in the biomedical field. In contrast to high-resource domains, biomedical named entity recognition (BioNER) often encounters limited human-labeled data in real-world scenarios, leading to poor generalization performance when training on only a few labeled instances. Recent approaches either leverage cross-domain high-resource data or fine-tune the pre-trained masked language model using limited labeled samples to generate new synthetic data, which is easily stuck in domain shift problems or yields low-quality synthetic data. Therefore, in this article, we study a more realistic scenario, i.e. few-shot learning for BioNER. RESULTS: Leveraging the domain knowledge graph, we propose knowledge-guided instance generation for few-shot BioNER, which generates diverse and novel entities based on similar semantic relations of neighbor nodes. In addition, by introducing a question prompt, we cast BioNER as a question-answering task and propose prompt contrastive learning to improve the robustness of the model by measuring the mutual information between query-answer pairs. Extensive experiments conducted on various few-shot settings show that the proposed framework achieves superior performance. Particularly, in a low-resource scenario with only 20 samples, our approach substantially outperforms recent state-of-the-art models on four benchmark datasets, achieving an average improvement of up to 7.1% F1. AVAILABILITY AND IMPLEMENTATION: Our source code and data are available at https://github.com/cpmss521/KGPC.


Subject(s)
Deep Learning , Humans , Software , Semantics , Benchmarking
6.
Methods ; 216: 3-10, 2023 08.
Article in English | MEDLINE | ID: mdl-37302520

ABSTRACT

As an important task of natural language processing, medication recommendation aims to recommend medication combinations according to the electronic health record, which can also be regarded as a multi-label classification task. But patients often have multiple diseases simultaneously, and the model must consider drug-drug interactions (DDI) of medication combinations when recommending medications, making medication recommendation more difficult. Little existing work explores the changes in patient conditions. However, these changes may point to future trends in patient conditions that are critical for reducing DDI rates in recommended drug combinations. In this paper, we propose the Patient Information Mining Network (PIMNet), which models the current core medications of a patient by mining the temporal and spatial changes of the patient medication order and patient condition vector, and allocates some auxiliary medications as the currently recommended medication combination. The experimental results show that the proposed model greatly reduces the DDI rate of the recommended medications while achieving results no lower than the state-of-the-art results.


Subject(s)
Data Mining , Drug Interactions , Humans , Drug Combinations
7.
J Biomed Inform ; 156: 104676, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38876451

ABSTRACT

Biomedical relation extraction has long been considered a challenging task due to the specialization and complexity of biomedical texts. Syntactic knowledge has been widely employed in existing research to enhance relation extraction, providing guidance for the semantic understanding and text representation of models. However, the utilization of syntactic knowledge in most studies is not exhaustive, and there is often a lack of fine-grained noise reduction, leading to confusion in relation classification. In this paper, we propose an attention generator that comprehensively considers both syntactic dependency type information and syntactic position information to distinguish the importance of different dependency connections. Additionally, we integrate positional information, dependency type information, and word representations together to introduce location-enhanced syntactic knowledge for guiding our biomedical relation extraction. Experimental results on three widely used English benchmark datasets in the biomedical domain show that our approach consistently outperforms a range of baseline models, demonstrating that it not only makes full use of syntactic knowledge but also effectively reduces the impact of noisy words.

8.
J Phys Chem A ; 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38991133

ABSTRACT

Polyethylene terephthalate (PET) is a type of polymer frequently used in plastic packaging that adds significantly to the amount of plastic waste found in landfills. One of the ways to recover valuable raw materials from postconsumer plastic is by depolymerizing PET into its monomeric constituents, which are dimethyl terephthalate (DMT) and ethylene glycol. PET depolymerization is often carried out by methanolysis with the help of acid or base catalysts. Tertiary amines are among the most attractive base catalysts for PET depolymerization by methanolysis since they do not lead to the generation of potentially environmentally harmful waste, unlike metal-based catalysts. However, the mechanism by which tertiary amines catalyze PET depolymerization in methanolysis remains unexplored. Developing a detailed mechanistic understanding of this process is important for improving plastic upcycling since it opens the possibility of employing various cheaper and more environmentally friendly reaction conditions. Using density functional theory and transition state analysis, we show that in the presence of tertiary amine catalysts, methanolysis of PET consists of multiple discrete-step reactions rather than a single concerted step. Furthermore, by comparing our calculations to recent experimental results, we were able to rationalize the DMT yield from the depolymerization process by relating it to charge polarization within tertiary amine catalysts, thus opening a pathway to identify atomic descriptors for future catalyst design.

9.
Small ; 19(41): e2302708, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37317018

ABSTRACT

Direct air capture (DAC) of CO2 has emerged as one of the most promising "negative carbon emission" technologies. Despite being state-of-the-art, sorbents deploying alkali hydroxides/amine solutions or amine-modified materials still suffer from unsolved high energy consumption and stability issues. In this work, composite sorbents are crafted by hybridizing a robust metal-organic framework (Ni-MOF) with superbase-derived ionic liquid (SIL), possessing well-maintained crystallinity and chemical structures. The low-pressure (0.4 mbar) volumetric CO2 capture assessment and a fixed-bed breakthrough examination with 400 ppm CO2 gas flow reveal high-performance DAC of CO2 (CO2 uptake capacity of up to 0.58 mmol g⁻¹ at 298 K) and exceptional cycling stability. Operando spectroscopy analysis reveals the rapid (400 ppm) CO2 capture kinetics and energy-efficient/fast CO2 releasing behaviors. The theoretical calculation and small-angle X-ray scattering demonstrate that the confinement effect of the MOF cavity enhances the interaction strength of reactive sites in SIL with CO2, indicating great efficacy of the hybridization. The achievements in this study showcase the exceptional capabilities of SIL-derived sorbents in carbon capture from ambient air in terms of rapid carbon capture kinetics, facile CO2 releasing, and good cycling performance.

10.
Methods ; 198: 3-10, 2022 02.
Article in English | MEDLINE | ID: mdl-34562584

ABSTRACT

The coronavirus disease 2019 (COVID-19) outbreak began in early December 2019, and COVID-19 has caused over 100 million cases and 2 million deaths around the world. One year after the COVID-19 outbreak, there is no definitive, approved medicine against it. Drug repositioning has become one line of scientific research that is being pursued to develop an effective drug. However, due to the lack of COVID-19 data, there is still no drug repositioning approach specifically targeting COVID-19. In this paper, we propose a framework for COVID-19 drug repositioning. This framework has several advantages that can be exploited: one is that a local graph aggregating representation is used across a heterogeneous network to address the data sparsity problem; another is that the multi-hop neighbors of the heterogeneous graph are aggregated to recall as many potential COVID-19 drugs as possible. Our experimental results show that our COVDR framework performs significantly better than baseline methods, and the docking simulation verifies that our three potential drugs have the ability to act against COVID-19.


Subject(s)
COVID-19 , Pharmaceutical Preparations , Antiviral Agents , Drug Repositioning , Humans , Molecular Docking Simulation , SARS-CoV-2
11.
J Biomed Inform ; 140: 104317, 2023 04.
Article in English | MEDLINE | ID: mdl-36804374

ABSTRACT

Named entity recognition is a key task in text mining. In the biomedical field, entity recognition focuses on extracting key information from large-scale biomedical texts for the downstream information extraction task. Biomedical literature contains a large amount of long-dependent text, and previous studies use external syntactic parsing tools to capture word dependencies in sentences to achieve nested biomedical entity recognition. However, the addition of external parsing tools often introduces unnecessary noise to the current auxiliary task and cannot improve the performance of entity recognition in an end-to-end way. Therefore, we propose a novel automatic dependency parsing approach, namely the ADPG model, to fuse syntactic structure information in an end-to-end way to recognize biomedical entities. Specifically, the method is based on a multilayer Tree-Transformer structure to automatically extract the semantic representation and syntactic structure in long-dependent sentences, and then combines a multilayer graph attention neural network (GAT) to extract the dependency paths between words in the syntactic structure to improve the performance of biomedical entity recognition. We evaluated our ADPG model on three biomedical domain and one news domain datasets, and the experimental results demonstrate that our model achieves state-of-the-art results on these four datasets with certain generalization performance. Our model is released on GitHub: https://github.com/Yumeng-Y/ADPG.


Subject(s)
Adenosine Diphosphate Glucose , Data Mining , Data Mining/methods , Neural Networks, Computer , Semantics
12.
J Biomed Inform ; 147: 104503, 2023 11.
Article in English | MEDLINE | ID: mdl-37778673

ABSTRACT

Predicting relationships between biological entities can greatly benefit important biomedical problems. Previous studies have attempted to represent biological entities and relationships in Euclidean space using embedding methods, which evaluate their semantic similarity by representing entities as numerical vectors. However, the limitation of these methods is that they cannot prevent the loss of latent hierarchical information when embedding large graph-structured data into Euclidean space, and therefore cannot capture the semantics of entities and relationships accurately. Hyperbolic spaces, such as the Poincaré ball, are better suited for hierarchical modeling than Euclidean spaces. This is because hyperbolic spaces exhibit negative curvature, causing distances to grow exponentially as points approach the boundary. In this paper, we propose HEM, a hyperbolic hierarchical knowledge graph embedding model to generate vector representations of bio-entities. By encoding the entities and relations in the hyperbolic space, HEM can capture latent hierarchical information and improve the accuracy of biological entity representation. Notably, HEM can preserve rich information with a low dimension compared with the methods that encode entities in Euclidean space. Furthermore, we explore the performance of HEM in protein-protein interaction prediction and gene-disease association prediction tasks. Experimental results demonstrate the superior performance of HEM over state-of-the-art baselines. The data and code are available at: https://github.com/Nan-ll/HEM.
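
The remark about distances growing near the boundary can be checked with the standard geodesic distance on the unit Poincaré ball. The sketch below implements that textbook formula only; it is not taken from the HEM code, and the function name and sample points are illustrative assumptions.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    Uses d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


origin = [0.0, 0.0]
# Equal Euclidean steps of 0.4 cost more and more hyperbolic distance
# as the second point moves toward the boundary of the ball.
near = poincare_distance(origin, [0.5, 0.0])
far = poincare_distance(origin, [0.9, 0.0])
```

Because so much distance is available near the boundary, a tree with many leaves can be laid out with little distortion even in few dimensions, which is consistent with the abstract's point about low-dimensional embeddings.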


Subject(s)
Knowledge , Pattern Recognition, Automated , Semantics
13.
J Biomed Inform ; 145: 104459, 2023 09.
Article in English | MEDLINE | ID: mdl-37531999

ABSTRACT

Document-level relation extraction is designed to recognize connections between entities across sentences or between sentences. The current mainstream document relation extraction model is mainly based on the graph method or combined with the pre-trained language model, which leads to the relatively complex process of the whole workflow. In this work, we propose biomedical relation extraction based on prompt learning to avoid complex relation extraction processes and obtain decent performance. In particular, we present a model that combines prompt learning with T5 for document relation extraction, by integrating a mask template mechanism into the model. In addition, this work also proposes a few-shot relation extraction method based on the K-nearest neighbor (KNN) algorithm with prompt learning. We select similar semantic labels through KNN, and subsequently conduct the relation extraction. The results acquired from two biomedical document benchmarks indicate that our model can improve the learning of document semantic information, achieving improvements in the relation F1 score of 3.1% on CDR.


Subject(s)
Algorithms , Semantics , Language , Learning , Natural Language Processing
14.
J Biomed Inform ; 125: 103956, 2022 01.
Article in English | MEDLINE | ID: mdl-34848329

ABSTRACT

Extracting entities and their relations from unstructured literature to form structured triplets is essential for biomedical knowledge extraction. Because sentences in biomedical datasets usually have many special overlapping triplets, it is difficult to use previous work to extract these triplets effectively. In this work, we propose a novel tagging strategy to achieve joint extraction in the machine reading comprehension framework. On the one hand, our method uses Query in the machine reading comprehension framework to introduce the information of the specific relation. On the other hand, our method introduces a tagging strategy for overlapping triplets in the biomedical domain. We use CHEMPROT and DDIExtraction2013 datasets to evaluate our method. The experimental results demonstrate that our proposed method can enhance the model's ability to deal with overlapping triplets, improving extraction performance.


Subject(s)
Comprehension , Data Mining , Language , Publications , Research Design
15.
Environ Res ; 214(Pt 2): 113863, 2022 11.
Article in English | MEDLINE | ID: mdl-35841969

ABSTRACT

Pollution of phenolic effluent from spice and plastics factories has become increasingly serious. Thus, developing a green and highly efficient adsorbent to remove phenolic compounds from wastewater is of urgent need. In this study, cellulose graft copolymer was synthesized through grafting 4-vinylpyridine monomer and polyethylene glycol methacrylate to a molecular skeleton of cellulose by free radical polymerization. The supramolecular hydrogel was successfully synthesized by physical cross-linking of cellulose graft copolymer and α-cyclodextrin. These supramolecular hydrogels were thoroughly characterized and the adsorption performance (adsorption isotherms and adsorption kinetics) of phenol on the supramolecular hydrogel was investigated in batch operation. The supramolecular hydrogel not only exhibited excellent adsorption of phenol, but also demonstrated increased mechanical strength due to the introduction of a modified cellulose base material. The adsorption kinetics of phenol on the supramolecular hydrogel followed a quasi-second-order reaction, with a correlation coefficient of 0.9909. The adsorption isotherm conformed to the Langmuir adsorption isotherm, and the maximum adsorption capacity of phenol can reach 80.71 mg g⁻¹, which was 2-3 times higher than that of traditional carbon-based materials. The results demonstrate the great promise of the waste-derived supramolecular hydrogel to be used as an efficient adsorbent in wastewater treatment.
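
The two models named in the abstract have simple closed forms, sketched below. This assumes that "quasi-second-order" refers to the commonly used pseudo-second-order rate law. Only the maximum capacity of 80.71 mg g⁻¹ comes from the abstract; the Langmuir constant, rate constant, and function names are hypothetical placeholders chosen for illustration.

```python
def langmuir_uptake(c_eq, q_max, k_l):
    """Equilibrium uptake (mg/g) from the Langmuir isotherm.

    c_eq  : equilibrium concentration of the adsorbate (mg/L)
    q_max : monolayer saturation capacity (mg/g)
    k_l   : Langmuir affinity constant (L/mg)
    """
    return q_max * k_l * c_eq / (1.0 + k_l * c_eq)


def pseudo_second_order_uptake(t, q_eq, k2):
    """Uptake (mg/g) at time t under the pseudo-second-order rate law.

    Integrated form of dq/dt = k2 * (q_eq - q)^2 with q(0) = 0.
    """
    return (k2 * q_eq ** 2 * t) / (1.0 + k2 * q_eq * t)


Q_MAX = 80.71   # mg/g, maximum capacity reported in the abstract
K_L = 0.05      # L/mg, hypothetical value for illustration only
K2 = 0.001      # g/(mg*min), hypothetical value for illustration only

# Uptake is half of q_max when the concentration equals 1 / K_L.
half_saturation = langmuir_uptake(1.0 / K_L, Q_MAX, K_L)
```

In practice both constants are obtained by fitting measured data, and the correlation coefficient quoted in the abstract is the goodness of that kinetic fit.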


Subject(s)
Water Pollutants, Chemical , Water Purification , Adsorption , Cellulose , Hydrogels , Hydrogen-Ion Concentration , Kinetics , Phenol , Phenols , Polymers , Wastewater , Water , Water Purification/methods
16.
BMC Bioinformatics ; 22(Suppl 1): 602, 2021 Dec 17.
Article in English | MEDLINE | ID: mdl-34920700

ABSTRACT

BACKGROUND: The recognition of pharmacological substances, compounds and proteins is essential for biomedical relation extraction, knowledge graph construction, drug discovery, as well as medical question answering. Although considerable efforts have been made to recognize biomedical entities in English texts, to date, only a few limited attempts have been made to recognize them from biomedical texts in other languages. PharmaCoNER is a named entity recognition challenge to recognize pharmacological entities from Spanish texts. Because there are currently abundant resources in the field of natural language processing, how to leverage these resources for the PharmaCoNER challenge merits study. METHODS: Inspired by the success of deep learning with language models, we compare and explore various representative BERT models to promote the development of the PharmaCoNER task. RESULTS: The experimental results show that deep learning with language models can effectively improve model performance on the PharmaCoNER dataset. Our method achieves state-of-the-art performance on the PharmaCoNER dataset, with a max F1-score of 92.01%. CONCLUSION: For the BERT models on the PharmaCoNER dataset, biomedical domain knowledge has a greater impact on model performance than the native language (i.e., Spanish). The BERT models can obtain competitive performance by using WordPiece to alleviate the out-of-vocabulary limitation. The performance of the BERT model can be further improved by constructing a specific vocabulary based on domain knowledge. Moreover, the character case also has a certain impact on model performance.


Subject(s)
Deep Learning
17.
Bioinformatics ; 36(15): 4323-4330, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32399565

ABSTRACT

MOTIVATION: The biomedical literature contains a wealth of chemical-protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information. Effective use of such information may improve the performance of CPI extraction. RESULTS: In this article, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks. AVAILABILITY AND IMPLEMENTATION: Data and code are available at https://github.com/CongSun-dlut/CPI_extraction. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Data Mining , Neural Networks, Computer , Probability , Publications , Research Design
18.
J Biomed Inform ; 123: 103896, 2021 11.
Article in English | MEDLINE | ID: mdl-34487887

ABSTRACT

Adverse drug reaction (ADR) detection is an important issue in drug safety. ADRs are health threats caused by medication. Identifying ADRs in a timely manner can reduce harm to patients and can also assist doctors in the rational use of drugs. Many studies have investigated potential ADRs based on social media due to the openness and timeliness of this resource; however, they have ignored the fine-grained emotional expression in social media text. In addition, the benchmark datasets from social media are usually small, which can result in the problem of over-fitting. In this paper, we propose the Adversarial Neural Network with Sentiment-aware Attention (ANNSA) model, which enhances the sentimental element in social media and improves the performance of neural networks via data augmentation. Specifically, a sentiment-aware attention mechanism is proposed to extract the word-level sentiment features associated with sentiment words and learn task-related information by optimizing a task-specific loss. For low-resource datasets, we use an adversarial training approach to generate perturbations of the word embeddings via an implicit regularization technique. ANNSA was tested on three social media ADR detection datasets, namely, Twitter, TwiMed (Twitter) and CADEC. The experimental results indicated the ability to achieve F1 values of 48.84%, 64.18% and 83.06%, respectively, comparable to the best results reported for state-of-the-art methods. Our study demonstrates that sentiment words are highly correlated with ADRs and that word-level sentiment features can assist in detecting ADRs from social media datasets.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Social Media , Attitude , Drug-Related Side Effects and Adverse Reactions/diagnosis , Humans , Machine Learning , Neural Networks, Computer , Pharmacovigilance
19.
J Biomed Inform ; 118: 103799, 2021 06.
Article in English | MEDLINE | ID: mdl-33965638

ABSTRACT

Recognition of biomedical entities from literature is a challenging research focus, which is the foundation for extracting a large amount of biomedical knowledge existing in unstructured texts into structured formats. Using the sequence labeling framework to implement biomedical named entity recognition (BioNER) is currently a conventional method. This method, however, often cannot take full advantage of the semantic information in the dataset, and the performance is not always satisfactory. In this work, instead of treating the BioNER task as a sequence labeling problem, we formulate it as a machine reading comprehension (MRC) problem. This formulation can introduce more prior knowledge utilizing well-designed queries, and no longer needs decoding processes such as conditional random fields (CRF). We conduct experiments on six BioNER datasets, and the experimental results demonstrate the effectiveness of our method. Our method achieves state-of-the-art (SOTA) performance on the BC4CHEMD, BC5CDR-Chem, BC5CDR-Disease, NCBI-Disease, BC2GM and JNLPBA datasets, achieving F1-scores of 92.92%, 94.19%, 87.83%, 90.04%, 85.48% and 78.93%, respectively.
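
As a rough illustration of recasting NER as reading comprehension, the sketch below turns one annotated sentence into one (query, context, answer spans) example per entity type. The query wording, the data layout, and the function name are assumptions made for this example; the abstract does not give the authors' actual queries or input format.

```python
from typing import Dict, List, Tuple

# Hypothetical natural-language queries, one per entity type.
# Such a query is where prior knowledge about a type could be injected.
QUERIES: Dict[str, str] = {
    "Chemical": "Find all chemical and drug mentions in the text.",
    "Disease": "Find all disease and symptom mentions in the text.",
}


def build_mrc_examples(
    tokens: List[str],
    entities: List[Tuple[int, int, str]],
) -> List[dict]:
    """Create one MRC example per entity type.

    entities holds (start_token, end_token_inclusive, entity_type) triples.
    A type with no mentions still yields an example with an empty answer list,
    so a model also sees cases where the query has no answer.
    """
    examples = []
    for entity_type, query in QUERIES.items():
        spans = [(s, e) for s, e, t in entities if t == entity_type]
        examples.append(
            {"query": query, "context": tokens, "answer_spans": spans}
        )
    return examples


tokens = ["Aspirin", "can", "cause", "gastric", "bleeding", "."]
gold = [(0, 0, "Chemical"), (3, 4, "Disease")]
examples = build_mrc_examples(tokens, gold)
```

A span-prediction model trained on such examples outputs start and end positions directly, which is why no label-sequence decoder is required in this formulation.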


Subject(s)
Comprehension , Semantics , Cognition , Data Mining
20.
BMC Med Inform Decis Mak ; 21(Suppl 2): 55, 2021 07 30.
Article in English | MEDLINE | ID: mdl-34330264

ABSTRACT

BACKGROUND: Clinical notes record the health status, clinical manifestations and other detailed information of each patient. The International Classification of Diseases (ICD) codes are important labels for electronic health records. Automatic medical code assignment to clinical notes through a deep learning model can not only improve work efficiency and accelerate the development of medical informatization but also facilitate the resolution of many issues related to medical insurance. Recently, neural network-based methods have been proposed for automatic medical code assignment. However, in the medical field, clinical notes are usually long documents that contain many complex sentences, and most current methods cannot effectively learn the representation of potential features from document text. METHODS: In this paper, we propose a hybrid capsule network model. Specifically, we use bi-directional LSTM (Bi-LSTM) with forward and backward directions to merge the information from both sides of the sequence. The label embedding framework embeds the text and labels together to leverage the label information. We then use a dynamic routing algorithm in the capsule network to extract valuable features for the medical code prediction task. RESULTS: We applied our model to the task of automatic medical code assignment to clinical notes and conducted a series of experiments based on MIMIC-III data. The experimental results show that our method achieves a micro F1-score of 67.5% on the MIMIC-III dataset, which outperforms the other state-of-the-art methods. CONCLUSIONS: The proposed model, which employs the dynamic routing algorithm and label embedding framework, can effectively capture the important features across sentences. Both capsule networks and domain knowledge are helpful for the medical code prediction task.
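
For readers unfamiliar with the reported metric, the sketch below shows how a micro-averaged F1 score is typically computed for multi-label code assignment: true positives, false positives, and false negatives are pooled over all notes before precision and recall are taken. The function name and the toy code sets are illustrative assumptions and are not drawn from MIMIC-III.

```python
from typing import Iterable, List, Set


def micro_f1(gold: List[Iterable[str]], pred: List[Iterable[str]]) -> float:
    """Micro-averaged F1 over documents that each carry a set of codes."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of documents")
    tp = fp = fn = 0
    for gold_codes, pred_codes in zip(gold, pred):
        g: Set[str] = set(gold_codes)
        p: Set[str] = set(pred_codes)
        tp += len(g & p)   # codes predicted and correct
        fp += len(p - g)   # codes predicted but not in the gold set
        fn += len(g - p)   # gold codes the model missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with made-up code assignments for two notes.
gold_codes = [{"401.9", "428.0"}, {"250.00"}]
pred_codes = [{"401.9"}, {"250.00", "414.01"}]
score = micro_f1(gold_codes, pred_codes)
```

Because counts are pooled, frequent codes dominate the micro average, whereas a macro average would weight every code equally.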


Subject(s)
International Classification of Diseases , Neural Networks, Computer , Algorithms , Electronic Health Records , Humans , Language