Results 1 - 20 of 33
1.
Neural Netw ; 176: 106354, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38723308

ABSTRACT

Neural operators, as powerful approximations of the non-linear operators between infinite-dimensional function spaces, have proved promising in accelerating the solution of partial differential equations (PDEs). However, they require a large amount of simulated data, which can be costly to collect. This can be avoided by learning physics from a physics-constrained loss, which we refer to as the mean squared residual (MSR) loss, constructed from the discretized PDE. We investigate the physical information in the MSR loss, which we call long-range entanglements, and identify the challenge that the neural network requires the capacity to model the long-range entanglements in the spatial domain of the PDE, whose patterns vary across PDEs. To tackle this challenge, we propose LordNet, a tunable and efficient neural network for modeling various entanglements. Inspired by traditional solvers, LordNet models the long-range entanglements with a series of matrix multiplications, which can be seen as a low-rank approximation to general fully-connected layers and which extracts the dominant pattern at reduced computational cost. Experiments on solving Poisson's equation and the (2D and 3D) Navier-Stokes equations demonstrate that the long-range entanglements from the MSR loss can be well modeled by LordNet, yielding better accuracy and generalization ability than other neural networks. The results show that LordNet can be 40× faster than traditional PDE solvers. In addition, LordNet outperforms other modern neural network architectures in accuracy and efficiency with the smallest parameter size.
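
To make the idea of a mean squared residual (MSR) loss concrete, the following is a minimal sketch for a 1D Poisson problem, -u'' = f, on a uniform grid. The abstract does not state which discretization the authors use; the second-order central difference, the function name `msr_loss_poisson_1d`, and the test problem are all assumptions chosen for illustration. No simulated solution data is needed: the loss only measures how badly a candidate violates the discretized equation.

```python
def msr_loss_poisson_1d(u, f, h):
    """Mean squared residual of -u'' = f on a uniform 1D grid (illustrative).

    u: candidate solution values, including both boundary points
    f: source term sampled at the same grid points
    h: grid spacing
    """
    residuals = []
    for i in range(1, len(u) - 1):
        # Second-order central difference approximating -u''(x_i).
        neg_second_derivative = (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
        residuals.append(neg_second_derivative - f[i])
    return sum(r * r for r in residuals) / len(residuals)


# Hypothetical test problem: u(x) = x * (1 - x) solves -u'' = 2 with u(0) = u(1) = 0.
n = 11
h = 1.0 / (n - 1)
xs = [i * h for i in range(n)]
exact = [x * (1.0 - x) for x in xs]
source = [2.0] * n

loss_exact = msr_loss_poisson_1d(exact, source, h)      # close to zero
loss_zero = msr_loss_poisson_1d([0.0] * n, source, h)   # the zero guess violates the PDE
```

Each residual couples only neighboring grid points, yet driving all residuals to zero at once ties every interior value to the distant boundary values, which is one way to read the "long-range entanglements" the abstract describes.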

2.
Article in English | MEDLINE | ID: mdl-38421846

ABSTRACT

Randomness is widely introduced in neural network training to simplify model optimization or avoid over-fitting. Among such techniques, dropout and its variations in different aspects (e.g., data, model structure) are prevalent in regularizing the training of deep neural networks. Though effective, the randomness introduced by these dropout-based methods causes non-negligible inconsistency between training and inference. In this paper, we introduce a simple consistency training strategy to regularize such randomness, namely R-Drop, which forces two output distributions sampled by each type of randomness to be consistent. Specifically, R-Drop minimizes the bidirectional KL-divergence between two output distributions produced by dropout-based randomness for each training sample. Theoretical analysis reveals that R-Drop can reduce the above inconsistency by reducing the inconsistency among the sampled sub-structures and bridging the gap between the loss calculated by the full model and that calculated by the sub-structures. Experiments on 7 widely-used deep learning tasks (23 datasets in total) demonstrate that R-Drop is universally effective for different types of neural networks (i.e., feed-forward, recurrent, and graph neural networks) and different learning paradigms (supervised, parameter-efficient, and semi-supervised). In particular, it achieves state-of-the-art performance with the vanilla Transformer model on WMT14 English → German translation (30.91 BLEU) and WMT14 English → French translation (43.95 BLEU), even surpassing models trained with extra large-scale data and expert-designed advanced variants of Transformer models. Our code is available on GitHub at https://github.com/dropreg/R-Drop.
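
The consistency term described above can be sketched in a few lines of plain Python. This is not the authors' implementation (their code is at the URL in the abstract); the toy linear classifier, the helper names, the 0.5 averaging factor, and the dropout rate are assumptions made only to show two stochastic forward passes of one sample being compared with a bidirectional KL-divergence.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def bidirectional_kl(p, q):
    """Symmetric consistency term: average of KL(p || q) and KL(q || p)."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def noisy_forward(features, weights, drop_prob, rng):
    """Hypothetical toy classifier: inverted dropout on the inputs, then a linear layer."""
    kept = [0.0 if rng.random() < drop_prob else x / (1.0 - drop_prob) for x in features]
    return [sum(w * x for w, x in zip(row, kept)) for row in weights]


rng = random.Random(0)
features = [0.5, -1.0, 2.0, 0.3]
weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, 0.1, 0.2], [0.1, 0.1, -0.2, 0.6]]

# Two forward passes of the same sample with independently drawn dropout masks.
p1 = softmax(noisy_forward(features, weights, 0.3, rng))
p2 = softmax(noisy_forward(features, weights, 0.3, rng))

consistency = bidirectional_kl(p1, p2)
```

In training, a term like `consistency` would be added, with some weight, to the usual task loss of both passes; the weighting is a hyperparameter that the abstract does not specify.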

3.
IEEE Trans Pattern Anal Mach Intell ; 46(6): 4234-4245, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38241115

ABSTRACT

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent years. Some questions naturally arise: whether a TTS system can achieve human-level quality, how to define and judge that quality, and how to achieve it. In this paper, we answer these questions by first defining human-level quality based on the statistical significance of a subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on benchmark datasets. Specifically, we leverage a variational auto-encoder (VAE) for end-to-end text-to-waveform generation, with several key modules to enhance the capacity of the prior from text and reduce the complexity of the posterior from speech, including phoneme pre-training, differentiable duration modeling, bidirectional prior/posterior modeling, and a memory mechanism in the VAE. Experimental evaluations on the popular LJSpeech dataset show that our proposed NaturalSpeech achieves -0.01 CMOS (comparative mean opinion score) relative to human recordings at the sentence level, with a Wilcoxon signed-rank test at p-level p >> 0.05, which demonstrates no statistically significant difference from human recordings for the first time.


Subjects
Algorithms; Humans; Signal Processing, Computer-Assisted; Speech/physiology; Natural Language Processing; Databases, Factual; Sound Spectrography/methods
4.
Nat Commun ; 15(1): 313, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38182565

ABSTRACT

Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low computational costs. Our proposed ViSNet outperforms state-of-the-art approaches on multiple MD benchmarks, including MD17, revised MD17 and MD22, and achieves excellent chemical property prediction on the QM9 and Molecule3D datasets. Furthermore, through a series of simulations and case studies, we show that ViSNet can efficiently explore the conformational space and provide reasonable interpretability by mapping geometric representations to molecular structures.

5.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37903413

ABSTRACT

Accurate prediction of drug-target affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep learning approaches. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. To overcome this challenge, we present the Semi-Supervised Multi-task training (SSM) framework for DTA prediction, which incorporates three simple yet highly effective strategies: (1) A multi-task training approach that combines DTA prediction with masked language modeling using paired drug-target data. (2) A semi-supervised training method that leverages large-scale unpaired molecules and proteins to enhance drug and target representations. This approach differs from previous methods that only employed molecules or proteins in pre-training. (3) The integration of a lightweight cross-attention module to improve the interaction between drugs and targets, further enhancing prediction accuracy. Through extensive experiments on benchmark datasets such as BindingDB, DAVIS and KIBA, we demonstrate the superior performance of our framework. Additionally, we conduct case studies on specific drug-target binding activities, virtual screening experiments, drug feature visualizations and real-world applications, all of which showcase the significant potential of our work. In conclusion, our proposed SSM-DTA framework addresses the data limitation challenge in DTA prediction and yields promising results, paving the way for more efficient and accurate drug discovery processes.


Subjects
Benchmarking; Drug Discovery; Drug Delivery Systems
6.
Phys Rev E ; 108(2-2): 025305, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37723802

ABSTRACT

The numerical determination of solitary states is an important topic for such research areas as Bose-Einstein condensates, nonlinear optics, plasma physics, and so on. In this paper, we propose a data-driven approach for identifying solitons based on dynamical solutions of real-time differential equations. Our approach combines a machine-learning architecture called the complex-valued neural operator (CNO) with an energy-restricted gradient optimization. The CNO serves as a generalization of the traditional neural operator to the complex domain, and constructs a smooth mapping between the initial and final states; the energy-restricted optimization facilitates the search for solitons by constraining the energy space. We concretely demonstrate this approach on the quasi-one-dimensional Bose-Einstein condensate with homogeneous and inhomogeneous nonlinearities. Our work offers an idea for data-driven effective modeling and studies of solitary waves in nonlinear physical systems.

7.
Nature ; 620(7972): 47-60, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37532811

ABSTRACT

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.


Subjects
Artificial Intelligence; Research Design; Artificial Intelligence/standards; Artificial Intelligence/trends; Datasets as Topic; Deep Learning; Research Design/standards; Research Design/trends; Unsupervised Machine Learning
9.
Sci Data ; 10(1): 549, 2023 08 22.
Article in English | MEDLINE | ID: mdl-37607915

ABSTRACT

Molecular dynamics (MD) simulations have revolutionized the modeling of biomolecular conformations and provided unprecedented insight into molecular interactions. Due to the prohibitive computational overhead of ab initio simulation for large biomolecules, dynamic modeling of proteins is generally restricted to molecular mechanics force fields, which suffer from low accuracy and ignore electronic effects. Here, we report AIMD-Chig, an MD dataset including 2 million conformations of the 166-atom protein Chignolin sampled at the density functional theory (DFT) level with 7,763,146 CPU hours. 10,000 conformations were initialized covering the whole conformational space of Chignolin, including folded, unfolded, and metastable states. Ab initio simulations were driven by M06-2X/6-31G* with a Berendsen thermostat at 340 K. We report coordinates, energies, and forces for each conformation. AIMD-Chig brings DFT-level conformational space exploration from small organic molecules to real-world proteins. It can serve as a benchmark for developing machine learning potentials for proteins and facilitate the exploration of protein dynamics with ab initio accuracy.


Subjects
Molecular Dynamics Simulation; Oligopeptides; Benchmarking; Machine Learning; Molecular Conformation
10.
J Chem Phys ; 159(3)2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37458355

ABSTRACT

Machine learning force fields (MLFFs) have gained popularity in recent years as they provide a cost-effective alternative to ab initio molecular dynamics (MD) simulations. Despite a small error on the test set, MLFFs inherently suffer from generalization and robustness issues during MD simulations. To alleviate these issues, we propose global force metrics and fine-grained metrics from element and conformation aspects to systematically measure MLFFs for every atom and every conformation of molecules. We selected three state-of-the-art MLFFs (ET, NequIP, and ViSNet) and comprehensively evaluated them on aspirin, Ac-Ala3-NHMe, and Chignolin MD datasets, with the number of atoms ranging from 21 to 166. Using the MLFFs trained on these molecules, we performed MD simulations from different initial conformations, analyzed the relationship between the force metrics and the stability of simulation trajectories, and investigated the reasons for collapsed simulations. Finally, the performance of MLFFs and the stability of MD simulations can be further improved by using the proposed force metrics to guide model training, specifically by training MLFF models with these force metrics as loss functions, fine-tuning by reweighting samples in the original dataset, and continued training with additional unexplored data.

11.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11407-11427, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37200120

ABSTRACT

Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both the machine learning and natural language processing communities. While NAR generation can significantly accelerate inference for machine translation, the speedup comes at the cost of lower translation accuracy compared to its counterpart, autoregressive (AR) generation. In recent years, many new models and algorithms have been proposed to bridge the accuracy gap between NAR generation and AR generation. In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects. Specifically, we categorize the efforts of NAT into several groups, including data manipulation, modeling methods, training criteria, decoding algorithms, and the benefit from pre-trained models. Furthermore, we briefly review other applications of NAR models beyond machine translation, such as grammatical error correction, text summarization, text style transfer, dialogue, semantic parsing, and automatic speech recognition. In addition, we discuss potential directions for future exploration, including removing the dependency on knowledge distillation (KD), reasonable training objectives, pre-training for NAR, and wider applications. We hope this survey can help researchers capture the latest progress in NAR generation, inspire the design of advanced NAR models and algorithms, and enable industry practitioners to choose appropriate solutions for their applications.

12.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36592061

ABSTRACT

Drug-drug interaction (DDI) prediction identifies interactions between drugs taken in combination; the adverse side effects caused by physicochemical incompatibility have attracted much attention. Previous studies usually model drug information from single or dual views of the whole drug molecules but ignore the detailed interactions among atoms, which leads to incomplete and noisy information and limits the accuracy of DDI prediction. In this work, we propose a novel dual-view drug representation learning network for DDI prediction ('DSN-DDI'), which employs local and global representation learning modules iteratively and learns drug substructures from the single drug ('intra-view') and the drug pair ('inter-view') simultaneously. Comprehensive evaluations demonstrate that DSN-DDI significantly improves performance on DDI prediction for existing drugs, achieving a relative accuracy improvement of 13.01% and over 99% accuracy under the transductive setting. More importantly, DSN-DDI achieves a relative accuracy improvement of 7.07% on unseen drugs, showing its usefulness for real-world DDI applications. Finally, DSN-DDI exhibits good transferability to synergistic drug combination prediction and thus can serve as a generalized framework in the drug discovery field.


Subjects
Drug-Related Side Effects and Adverse Reactions; Humans; Drug Interactions; Drug Discovery; Computational Biology
13.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36573491

ABSTRACT

Precisely predicting drug-drug interactions (DDIs) is an important application and a hot research topic in drug discovery, especially for avoiding adverse effects when using drug combination treatments for patients. Machine learning and deep learning methods have achieved great success in DDI prediction. However, we notice that most existing works ignore the importance of the relation type when building DDI prediction models. In this work, we propose a novel R$^2$-DDI framework, which introduces a relation-aware feature refinement module for drug representation learning. The relation feature is integrated into the drug representation and refined in the framework. With the refined features, we also incorporate a consistency training method to regularize the multi-branch predictions for better generalization. Through extensive experiments and studies, we demonstrate that our R$^2$-DDI approach can significantly improve DDI prediction performance over multiple real-world datasets and settings, and that our method shows better generalization ability with the help of the feature refinement design.


Subjects
Drug-Related Side Effects and Adverse Reactions; Humans; Drug Interactions; Machine Learning; Drug Discovery
14.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3421-3433, 2023 Mar.
Article in English | MEDLINE | ID: mdl-35594229

ABSTRACT

In pixel-based reinforcement learning (RL), the states are raw video frames, which are mapped into a hidden representation before being fed to a policy network. To improve the sample efficiency of state representation learning, the most prominent recent work is based on contrastive unsupervised representation learning. Observing that consecutive video frames in a game are highly correlated, we propose a new algorithm to further improve data efficiency, namely masked contrastive representation learning for RL (M-CURL), which takes the correlation among consecutive inputs into consideration. In our architecture, besides a CNN encoder for the hidden representation of the input state and a policy network for action selection, we introduce an auxiliary Transformer encoder module to leverage the correlations among video frames. During training, we randomly mask the features of several frames, and use the CNN encoder and Transformer to reconstruct them based on context frames. The CNN encoder and Transformer are jointly trained via contrastive learning, where the reconstructed features should be similar to the ground-truth ones while dissimilar to others. During policy evaluation, the CNN encoder and the policy network are used to take actions, and the Transformer module is discarded. Our method achieves consistent improvements over CURL on 14 out of 16 environments from the DMControl suite and 23 out of 26 environments from Atari 2600 Games. The code is available at https://github.com/teslacool/m-curl.
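
The contrastive objective described above, in which each reconstructed feature should match its own ground-truth feature and no other, can be sketched as an InfoNCE-style loss. This is only an illustration and not the released code: the plain dot-product similarity, the function names, and the toy three-frame vectors are assumptions, and the CNN encoder and Transformer that would produce the features are omitted.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(reconstructed, targets):
    """InfoNCE-style loss: reconstructed[i] should score highest against targets[i]."""
    total = 0.0
    for i, query in enumerate(reconstructed):
        logits = [dot(query, key) for key in targets]
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-softmax probability of the matching (positive) pair.
        total += log_norm - logits[i]
    return total / len(reconstructed)


# Hypothetical ground-truth features of three masked frames.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Reconstructions aligned with their own targets versus with the wrong ones.
aligned = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
shuffled = [[0.0, 4.0, 0.0], [0.0, 0.0, 4.0], [4.0, 0.0, 0.0]]

loss_aligned = contrastive_loss(aligned, targets)
loss_shuffled = contrastive_loss(shuffled, targets)
```

The aligned reconstructions yield a much lower loss than the shuffled ones, which is the signal that would push the encoder to produce features recoverable from neighboring frames.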

15.
J Phys Chem B ; 126(46): 9465-9475, 2022 11 24.
Article in English | MEDLINE | ID: mdl-36345778

ABSTRACT

Markov state models (MSMs) play a key role in studying protein conformational dynamics. A sliding count window with a fixed lag time is widely used to sample sub-trajectories for transition counting and MSM construction. However, sub-trajectories sampled with a fixed lag time may not perform well under different selections of lag time, so choosing it requires substantial prior experience and leads to less robust estimation. To alleviate this problem, we propose a novel stochastic method based on a Poisson process to generate perturbative lag times for sub-trajectory sampling and utilize it to construct a Markov chain. Comprehensive evaluations on the double-well system, WW domain, BPTI, and RBD-ACE2 complex of SARS-CoV-2 reveal that our algorithm significantly increases the robustness and power of a constructed MSM without disturbing the Markovian properties. Furthermore, the superiority of our algorithm is amplified for slow dynamic modes in complex biological processes.
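
As a rough illustration of transition counting with a randomly perturbed lag time, the sketch below draws a lag from a Poisson distribution for every window start and then row-normalizes the counts. The abstract does not spell out how the authors derive lag times from the Poisson process, so this is only one possible reading; the function names, the guard against a zero lag, and the toy two-state trajectory are assumptions.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's algorithm for a Poisson-distributed integer (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def count_transitions(trajectory, n_states, mean_lag, rng):
    """Count state-to-state transitions, drawing a fresh lag for each window start."""
    counts = [[0] * n_states for _ in range(n_states)]
    for start in range(len(trajectory)):
        lag = max(1, sample_poisson(mean_lag, rng))  # assumption: forbid a zero lag
        end = start + lag
        if end < len(trajectory):
            counts[trajectory[start]][trajectory[end]] += 1
    return counts


def row_normalize(counts):
    """Turn a count matrix into a row-stochastic transition matrix."""
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


rng = random.Random(1)
trajectory = [0, 0, 1, 1] * 50  # hypothetical discretized two-state trajectory
counts = count_transitions(trajectory, 2, 3.0, rng)
transition_matrix = row_normalize(counts)
```

With a fixed lag, every window would span the same number of frames; here the span varies around `mean_lag`, so the estimate does not hinge on a single hand-picked value.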


Subjects
COVID-19; SARS-CoV-2; Humans; Markov Chains; Protein Conformation; Algorithms; Molecular Dynamics Simulation
16.
Bioinformatics ; 38(22): 5100-5107, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36205562

ABSTRACT

MOTIVATION: The interaction between drugs and targets (DTI) in the human body plays a crucial role in biomedical science and applications. As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge, usually in the form of triplets of drugs, targets and their interactions, from biomedical literature has become an urgent demand in the industry. Existing methods of discovering biological knowledge are mainly extractive approaches that often require detailed annotations (e.g. all mentions of biological entities, relations between every two entity mentions, etc.). However, it is difficult and costly to obtain sufficient annotations due to the requirement of expert knowledge from biomedical domains. RESULTS: To overcome these difficulties, we explore an end-to-end solution for this task by using generative approaches. We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations. Further, we propose a semi-supervised method, which leverages the aforementioned end-to-end model to filter unlabeled literature and label it. Experimental results show that our method significantly outperforms extractive baselines on DTI discovery. We also create a dataset, KD-DTI, to advance this task and release it to the community. AVAILABILITY AND IMPLEMENTATION: Our code and data are available at https://github.com/bert-nmt/BERT-DTI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Publications; Software; Humans; Drug Interactions
17.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36136367

ABSTRACT

A good understanding of protein function and structure in computational biology helps in the understanding of human beings. To cope with the limited number of proteins that are annotated structurally and functionally, the scientific community has embraced self-supervised pre-training on large amounts of unlabeled protein sequences for protein embedding learning. However, a protein is usually represented by individual amino acids with a limited vocabulary size (e.g. 20 amino acid types), without considering the strong local semantics existing in protein sequences. In this work, we propose a novel pre-training modeling approach, SPRoBERTa. We first present an unsupervised protein tokenizer to learn protein representations with local fragment patterns. Then, a novel framework for a deep pre-training model is introduced to learn protein embeddings. After pre-training, our method can be easily fine-tuned for different protein tasks, including amino acid-level prediction tasks (e.g. secondary structure prediction), amino acid pair-level prediction tasks (e.g. contact prediction) and protein-level prediction tasks (remote homology prediction, protein function prediction). Experiments show that our approach achieves significant improvements in all tasks and outperforms previous methods. We also provide detailed ablation studies and analysis for our protein tokenizer and training framework.


Subjects
Computational Biology; Proteins; Humans; Proteins/chemistry; Computational Biology/methods; Amino Acid Sequence; Protein Structure, Secondary; Amino Acids
18.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36156661

ABSTRACT

Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models in the general language domain, i.e. BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, such as BioBERT and PubMedBERT. While they have achieved great success on a variety of discriminative downstream biomedical tasks, the lack of generation ability constrains their application scope. In this paper, we propose BioGPT, a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature. We evaluate BioGPT on six biomedical natural language processing tasks and demonstrate that our model outperforms previous models on most tasks. Especially, we get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA, creating a new record. Our case study on text generation further demonstrates the advantage of BioGPT on biomedical literature to generate fluent descriptions for biomedical terms.


Subjects
Data Mining; Natural Language Processing
19.
Environ Sci Technol ; 56(14): 9903-9914, 2022 07 19.
Article in English | MEDLINE | ID: mdl-35793558

ABSTRACT

Accurate timely estimation of emissions of nitrogen oxides (NOx) is a prerequisite for designing an effective strategy for reducing O3 and PM2.5 pollution. The satellite-based top-down method can provide near-real-time constraints on emissions; however, its efficiency is largely limited by efforts in dealing with the complex emission-concentration response. Here, we propose a novel machine-learning-based method using a physically informed variational autoencoder (VAE) emission predictor to infer NOx emissions from satellite-retrieved surface NO2 concentrations. The computational burden can be significantly reduced with the help of a neural network trained with a chemical transport model, allowing the VAE emission predictor to provide a timely estimation of posterior emissions based on the satellite-retrieved surface NO2 concentration. The VAE emission predictor successfully corrected the underestimation of NOx emissions in rural areas and the overestimation in urban areas, resulting in smaller normalized mean biases (reduced from -0.8 to -0.4) and larger R2 values (increased from 0.4 to 0.7). The interpretability of the VAE emission predictor was investigated using sensitivity analysis by modulating each feature, indicating that NO2 concentration and planetary boundary layer (PBL) height are important for estimating NOx emissions, which is consistent with our common knowledge. The advantages of the VAE emission predictor in efficiency, flexibility, and accuracy demonstrate its great potential in estimating the latest emissions and evaluating the control effectiveness from observations.
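
For readers unfamiliar with the two evaluation metrics quoted above, the sketch below computes a normalized mean bias and a squared correlation coefficient. The abstract does not define either formula, so the common definitions used here (NMB as summed error over summed observations, R² as squared Pearson correlation) are assumptions, and the numbers are made up for illustration; they are not data from the study.

```python
def normalized_mean_bias(predicted, observed):
    """NMB = sum(predicted - observed) / sum(observed); negative means underestimation."""
    return sum(p - o for p, o in zip(predicted, observed)) / sum(observed)


def squared_pearson(predicted, observed):
    """Squared Pearson correlation between predictions and observations."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Made-up emission values in arbitrary units, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
prior = [2.0, 4.0, 6.0, 8.0]
posterior = [7.0, 11.0, 19.0, 23.0]

nmb_prior = normalized_mean_bias(prior, observed)
nmb_posterior = normalized_mean_bias(posterior, observed)
r2_posterior = squared_pearson(posterior, observed)
```

A bias closer to zero after correction, as in the toy `posterior` list, is the kind of improvement the abstract reports when it describes smaller normalized mean biases.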


Subjects
Air Pollutants; Air Pollution; Air Pollutants/analysis; Air Pollution/analysis; Neural Networks, Computer; Nitric Oxide/analysis; Nitrogen Dioxide/analysis; Nitrogen Oxides/analysis; Vehicle Emissions/analysis
20.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35514186

ABSTRACT

The identification of active binding drugs for target proteins (referred to as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery. Although recent deep learning-based approaches achieve better performance than molecular docking, existing models often neglect the topological or spatial aspects of intermolecular information, hindering prediction performance. We recognize this problem and propose a novel approach called the Intermolecular Graph Transformer (IGT) that employs a dedicated attention mechanism to model intermolecular information with a three-way Transformer-based architecture. IGT outperforms the second-best state-of-the-art (SoTA) approach by 9.1% and 20.5% for binding activity and binding pose prediction, respectively, and exhibits better generalization to unseen receptor proteins than SoTA approaches. Furthermore, IGT exhibits promising drug screening ability against severe acute respiratory syndrome coronavirus 2 by identifying 83.1% of the active drugs that have been validated by wet-lab experiments, with near-native predicted binding poses. Source code and datasets are available at https://github.com/microsoft/IGT-Intermolecular-Graph-Transformer.


Subjects
Algorithms; COVID-19; Humans; Molecular Docking Simulation; Proteins/chemistry; Software