Results 1 - 12 of 12
1.
Bioinformatics ; 40(9), 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39222004

ABSTRACT

MOTIVATION: Natural language is poised to become a key medium for human-machine interactions in the era of large language models. In the field of biochemistry, tasks such as property prediction and molecule mining are critically important yet technically challenging. Bridging molecular expressions in natural language and chemical language can significantly enhance the interpretability and ease of these tasks. Moreover, it can integrate chemical knowledge from various sources, leading to a deeper understanding of molecules. RESULTS: Recognizing these advantages, we introduce the concept of conversational molecular design, a novel task that utilizes natural language to describe and edit target molecules. To better accomplish this task, we develop ChatMol, a knowledgeable and versatile generative pretrained model. This model is enhanced by incorporating experimental property information, molecular spatial knowledge, and the associations between natural and chemical languages. Several typical solutions, including large language models (e.g. ChatGPT), are evaluated, demonstrating the difficulty of conversational molecular design and the effectiveness of our knowledge enhancement approach. Case observations and analysis offer insights and directions for further exploration of natural-language interaction in molecular discovery. AVAILABILITY AND IMPLEMENTATION: Code and data are available at https://github.com/Ellenzzn/ChatMol/tree/main.


Subjects
Natural Language Processing, Humans, Software, Computational Biology/methods
2.
Materials (Basel) ; 17(2), 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38255495

ABSTRACT

AlN epilayers were grown on magnetron-sputtered (MS) (11-22) AlN buffers on m-plane sapphire substrates at 1450 °C via hydride vapour phase epitaxy (HVPE). The MS buffers were annealed at high temperatures of 1400-1600 °C. All the samples were characterised using X-ray diffraction, atomic force microscopy, scanning electron microscopy and Raman spectroscopy. The crystal quality of epilayers regrown by HVPE was improved significantly compared to that of the MS counterpart. With increasing annealing temperature, the crystal quality of both MS buffers and AlN epilayers measured along [11-23] and [1-100] first improved and then degraded, possibly due to the decomposition of the MS buffers, while the corresponding anisotropy along the two directions first decreased and then increased. The optimum quality of the AlN epilayer was obtained at an annealing temperature of around 1500 °C. In addition, it was found that the anisotropy of the epilayers decreased significantly compared to that of the annealed MS buffers when the annealing temperature was below 1500 °C.

3.
IEEE Trans Neural Netw Learn Syst ; 34(5): 2271-2283, 2023 May.
Article in English | MEDLINE | ID: mdl-34469314

ABSTRACT

Information diffusion prediction is an important task, which studies how information items spread among users. With the success of deep learning techniques, recurrent neural networks (RNNs) have shown their powerful capability in modeling information diffusion as sequential data. However, previous works focused on either microscopic diffusion prediction, which aims to predict who will be the next influenced user and at what time, or macroscopic diffusion prediction, which estimates the total number of influenced users during the diffusion process. To the best of our knowledge, few attempts have been made to suggest a unified model for both microscopic and macroscopic scales. In this article, we propose a novel full-scale diffusion prediction model based on reinforcement learning (RL). RL incorporates the macroscopic diffusion size information into the RNN-based microscopic diffusion model by addressing the nondifferentiable problem. We also employ an effective structural context extraction strategy to utilize the underlying social graph information. Experimental results show that our proposed model outperforms state-of-the-art baseline models on both microscopic and macroscopic diffusion predictions on three real-world datasets.

4.
Chem Sci ; 14(35): 9360-9373, 2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37712039

ABSTRACT

AI has been widely applied in scientific scenarios, such as robots performing chemical synthetic actions to free researchers from monotonous experimental procedures. However, there exists a gap between human-readable natural language descriptions and machine-executable instructions: the former are typically found in numerous chemical articles, while the latter are currently compiled manually by experts. We apply the latest technology of pre-trained models and achieve automatic transcription between descriptions and instructions. We design a concise and comprehensive schema of instructions and construct an open-source human-annotated dataset consisting of 3950 description-instruction pairs, with 9.2 operations in each instruction on average. We further propose knowledgeable pre-trained transcription models enhanced by multi-grained chemical knowledge. The performance of recent popular models and products showing great capability in automatic writing (e.g., ChatGPT) has also been explored. Experiments show that our system improves the instruction compilation efficiency of researchers by at least 42%, and can generate fluent academic paragraphs of synthetic descriptions when given instructions, showing the great potential of pre-trained models in improving human productivity.

5.
Nat Commun ; 13(1): 862, 2022 02 14.
Article in English | MEDLINE | ID: mdl-35165275

ABSTRACT

To accelerate the biomedical research process, deep-learning systems are developed to automatically acquire knowledge about molecule entities by reading large-scale biomedical data. Inspired by how humans learn deep molecule knowledge from versatile reading of both molecule structure and biomedical text information, we propose a knowledgeable machine reading system that bridges both types of information in a unified deep-learning framework for comprehensive biomedical research assistance. We solve the problem that existing machine reading models can only process different types of data separately, and thus achieve a comprehensive and thorough understanding of molecule entities. By grasping meta-knowledge in an unsupervised fashion within and across different information sources, our system can facilitate various real-world biomedical applications, including molecular property prediction, biomedical relation extraction and so on. Experimental results show that our system even surpasses human professionals in the capability of molecular property comprehension, and also reveal its promising potential in facilitating automatic drug discovery and documentation in the future.


Subjects
Data Mining, Deep Learning, Drug Discovery/methods, Natural Language Processing, Reading, Algorithms, Biomedical Research, Electronic Data Processing, Humans, Information Storage and Retrieval, Molecular Structure
6.
Micromachines (Basel) ; 13(1), 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35056294

ABSTRACT

High-quality AlN film is a key factor affecting the performance of deep-ultraviolet optoelectronic devices. In this work, high-temperature annealing technology in a nitrogen atmosphere was used to improve the quality of AlN films with different polarities grown by magnetron sputtering. After annealing at 1400-1650 °C, the crystal quality of the AlN films was improved. However, there was a gap between the quality of non-polar and polar films. In addition, compared with the semi-polar film, the quality of the non-polar film was more easily improved by annealing. The anisotropy of both the semi-polar and non-polar films decreased with increasing annealing temperature. The results of Raman spectroscopy, scanning electron microscopy and X-ray photoelectron spectroscopy revealed that the annihilation of impurities and grain boundaries during the annealing process was responsible for the improvement of crystal quality and the differences between the films with different polarities.

7.
IEEE Trans Vis Comput Graph ; 28(12): 4980-4994, 2022 12.
Article in English | MEDLINE | ID: mdl-35724276

ABSTRACT

The rapid development of deep natural language processing (NLP) models for text classification has led to an urgent need for a unified understanding of these individually proposed models. Existing methods cannot meet the need for understanding different models in one framework due to the lack of a unified measure for explaining both low-level (e.g., words) and high-level (e.g., phrases) features. We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification. The key idea is a mutual information-based measure, which provides quantitative explanations of how each layer of a model maintains the information of input words in a sample. We model the intra- and inter-word information at each layer, measuring the importance of a word to the final prediction as well as the relationships between words, such as the formation of phrases. A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples. Two case studies on classification tasks and comparison between models demonstrate that DeepNLPVis can help users effectively identify potential problems caused by samples and model architectures and then make informed improvements.


Subjects
Computer Graphics, Natural Language Processing
8.
J Colloid Interface Sci ; 623: 617-626, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35598488

ABSTRACT

Constructing a heterostructure is an efficient method to provide more active sites and optimize the electronic structure for improving oxygen evolution reaction (OER) and urea oxidation reaction (UOR) performance. Herein, a 3D FeOOH@Co3O4 heterostructure was constructed by coating an FeOOH layer (10-20 nm) on the surface of Co3O4 nanoneedles through the strong hydrolysis of Fe3+. The FeOOH@Co3O4 heterostructure not only retains the nanoneedle structure with open frameworks, but also improves the specific surface area and expedites charge transfer. The FeOOH@Co3O4-240 heterostructure affords a remarkable OER performance with a low overpotential of 228 mV at 10 mA/cm2 in 1 M KOH solution. The symmetrical urea electrolyzer using FeOOH@Co3O4-240 as both anode and cathode delivers 10 mA/cm2 at 1.43 V. Density functional theory (DFT) calculations reveal that the FeOOH@Co3O4-240 heterostructure can adjust the electronic structure and strengthen the conductivity. This work offers a facile and economical strategy for designing heterojunction catalysts.
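
As a quick check on what the overpotential figure implies, the sketch below converts the reported 228 mV into an applied potential. It assumes the standard OER equilibrium potential of 1.23 V versus the reversible hydrogen electrode (RHE), a textbook value that the abstract itself does not state; the function and constant names are illustrative only.

```python
# Assumption: OER equilibrium potential of 1.23 V vs. RHE (textbook value,
# not stated in the abstract).
OER_EQUILIBRIUM_V = 1.23


def applied_potential_v(overpotential_mv: float) -> float:
    """Return the applied potential (V vs. RHE) for an OER overpotential given in mV."""
    return OER_EQUILIBRIUM_V + overpotential_mv / 1000.0


# Overpotential reported in the abstract at 10 mA/cm2.
potential = applied_potential_v(228.0)
print(f"{potential:.3f} V vs. RHE")
```

Under that assumption, the reported overpotential corresponds to roughly 1.458 V versus RHE at 10 mA/cm2.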

9.
IEEE Trans Big Data ; 7(1): 81-92, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-35936829

ABSTRACT

Country image has a profound influence on international relations and economic development. In the worldwide outbreak of COVID-19, countries and their people display different reactions, resulting in diverse perceived images among foreign publics. Therefore, in this article, we take China as a specific and typical case and investigate its image with aspect-based sentiment analysis on a large-scale Twitter dataset. To our knowledge, this is the first study to explore country image in such a fine-grained way. To perform the analysis, we first build a manually labeled Twitter dataset with aspect-level sentiment annotations. Afterward, we conduct the aspect-based sentiment analysis with BERT to explore the image of China. We discover an overall sentiment change from non-negative to negative in the general public, and explain it with the increasing mentions of negative ideology-related aspects and decreasing mentions of non-negative fact-based aspects. Further investigations into different groups of Twitter users, including U.S. Congress members, English media, and social bots, reveal different patterns in their attitudes toward China. This article provides a deeper understanding of the changing image of China during the COVID-19 pandemic. Our research also demonstrates how aspect-based sentiment analysis can be applied in social science research to deliver valuable insights.

10.
Front Big Data ; 4: 602071, 2021.
Article in English | MEDLINE | ID: mdl-33817631

ABSTRACT

Recommender systems aim to provide item recommendations for users and usually face data sparsity problems (e.g., cold start) in real-world scenarios. Recently, pre-trained models have shown their effectiveness in knowledge transfer between domains and tasks, which can potentially alleviate the data sparsity problem in recommender systems. In this survey, we first provide a review of recommender systems with pre-training. In addition, we show the benefits of pre-training to recommender systems through experiments. Finally, we discuss several promising directions for future research on recommender systems with pre-training. The source code of our experiments will be available to facilitate future research.

11.
Front Neurorobot ; 13: 93, 2019.
Article in English | MEDLINE | ID: mdl-31798437

ABSTRACT

The problem of generating structured Knowledge Graphs (KGs) is difficult and open but relevant to a range of tasks related to decision making and information augmentation. A promising approach is to study generating KGs as a relational representation of inputs (e.g., textual paragraphs or natural images), where nodes represent the entities and edges represent the relations. This procedure is naturally a mixture of two phases: extracting primary relations from input, and completing the KG with reasoning. In this paper, we propose a hybrid KG builder that combines these two phases in a unified framework and generates KGs from scratch. Specifically, we employ a neural relation extractor resolving primary relations from input and a differentiable inductive logic programming (ILP) model that iteratively completes the KG. We evaluate our framework in both textual and visual domains and achieve comparable performance on relation extraction datasets based on Wikidata and the Visual Genome. The framework surpasses neural baselines by a noticeable gap in reasoning out dense KGs and overall performs particularly well for rare relations.
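
The two phases described above can be illustrated with a deliberately simplified sketch. It is not the authors' method: the neural relation extractor is replaced by a hand-written set of primary triples, and the differentiable ILP model is swapped for plain symbolic forward chaining over hypothetical two-hop rules. All names, triples, and rules are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)
# Hypothetical rule format: (r1, r2, r_out) means
# (x, r1, y) and (y, r2, z) together imply (x, r_out, z).
Rule = Tuple[str, str, str]


def complete_kg(primary: Set[Triple], rules: List[Rule], max_iters: int = 10) -> Set[Triple]:
    """Iteratively add triples implied by the rules until nothing new appears."""
    kg = set(primary)
    for _ in range(max_iters):
        new: Set[Triple] = set()
        for r1, r2, r_out in rules:
            for x, ra, y in kg:
                if ra != r1:
                    continue
                for y2, rb, z in kg:
                    if rb == r2 and y2 == y:
                        candidate = (x, r_out, z)
                        if candidate not in kg:
                            new.add(candidate)
        if not new:  # fixed point reached
            break
        kg |= new
    return kg


# Phase 1 stand-in: primary relations that an extractor might return.
primary = {("Paris", "capital_of", "France"), ("France", "part_of", "Europe")}
# Phase 2: complete the graph with one illustrative rule.
rules = [("capital_of", "part_of", "located_in")]
completed = complete_kg(primary, rules)
```

Here the completion step adds the triple ("Paris", "located_in", "Europe") to the two primary relations. A differentiable ILP model would instead learn soft rules jointly with the extractor, which this sketch does not attempt.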

12.
PLoS One ; 10(4): e0118437, 2015.
Article in English | MEDLINE | ID: mdl-25874581

ABSTRACT

Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, but do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge regularized word representation models (KRWR) to incorporate this prior knowledge when learning distributed word representations. Experimental results demonstrate that our estimated word representation achieves better performance on the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining.
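
To make the idea of link-based regularization concrete, the sketch below shows one common form such a term can take: a penalty equal to the squared distance between the vectors of linked words, reduced by gradient descent. The abstract does not specify the regularizer KRWR actually uses, so the penalty, the function names, and the toy vectors are all assumptions, and the corpus-based training objective is omitted entirely.

```python
from typing import List, Tuple

Vector = List[float]


def link_penalty(emb: List[Vector], links: List[Tuple[int, int]]) -> float:
    """Sum of squared Euclidean distances between vectors of linked word pairs."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(emb[i], emb[j]))
        for i, j in links
    )


def penalty_step(emb: List[Vector], links: List[Tuple[int, int]], lr: float = 0.1) -> List[Vector]:
    """One gradient-descent step on the link penalty alone (hypothetical setup)."""
    grads = [[0.0] * len(v) for v in emb]
    for i, j in links:
        for k, (a, b) in enumerate(zip(emb[i], emb[j])):
            grads[i][k] += 2 * (a - b)  # d/d emb[i] of (a - b)^2
            grads[j][k] -= 2 * (a - b)  # d/d emb[j] of (a - b)^2
    return [[x - lr * g for x, g in zip(v, gv)] for v, gv in zip(emb, grads)]


# Toy vectors: words 0 and 1 are linked (e.g. synonyms); word 2 is unlinked.
embeddings = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
links = [(0, 1)]
before = link_penalty(embeddings, links)
updated = penalty_step(embeddings, links)
after = link_penalty(updated, links)
```

In this toy case the step pulls the two linked vectors toward each other, lowering the penalty, while the unlinked vector is left untouched. In a full model this term would be added to the unsupervised objective with a weighting coefficient.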


Subjects
Knowledge Bases, Linguistics/methods, Algorithms, Data Mining, Humans, Language, Theoretical Models, Natural Language Processing, Semantics, Verbal Learning