Results 1 - 4 of 4
1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-39007592

ABSTRACT

High-throughput DNA sequencing technologies decode tremendous amounts of microbial protein-coding gene sequences. However, accurately assigning protein functions to novel gene sequences remains a challenge. To this end, we developed FunGeneTyper, an extensible framework with two new deep learning models (i.e., FunTrans and FunRep), structured databases, and supporting resources for achieving highly accurate (Accuracy > 0.99, F1-score > 0.97) and fine-grained classification of antibiotic resistance genes (ARGs) and virulence factor genes. Using an experimentally confirmed dataset of ARGs comprising remote homologous sequences as the test set, our framework achieves by far the best performance in the discovery of new ARGs from human gut (F1-score: 0.6948), wastewater (0.6072), and soil (0.5445) microbiomes, beating the state-of-the-art bioinformatics tools and sequence alignment-based (F1-score: 0.0556-0.5065) and domain-based (F1-score: 0.2630-0.5224) annotation approaches. Furthermore, our framework is implemented as a lightweight, privacy-preserving, and plug-and-play neural network module, facilitating its versatility and accessibility to developers and users worldwide. We anticipate widespread utilization of FunGeneTyper (https://github.com/emblab-westlake/FunGeneTyper) for precise classification of protein-coding gene functions and the discovery of numerous valuable enzymes. This advancement will have a significant impact on various fields, including microbiome research, biotechnology, metagenomics, and bioinformatics.
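
The accuracy and F1-score figures quoted in this abstract can be read against the standard definitions of those metrics. The following is a minimal sketch for a single class, not code from FunGeneTyper: the function name and the counts in the usage line are hypothetical, and the abstract does not say how the scores are averaged across classes.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for one class. Empty denominators yield 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to illustrate the call.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Because F1 ignores true negatives, it can be far lower than accuracy on imbalanced data, which is one reason both numbers are usually reported together.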


Subject(s)
Deep Learning , Humans , Computational Biology/methods , Microbiota/genetics , Bacterial Proteins/genetics , Drug Resistance, Microbial/genetics , Software , High-Throughput Nucleotide Sequencing/methods , Virulence Factors/genetics
2.
Article in English | MEDLINE | ID: mdl-39302780

ABSTRACT

Continual learning (CL) aims to learn new tasks without forgetting previous tasks. However, existing CL methods require a large amount of raw data, which is often unavailable due to copyright considerations and privacy risks. Instead, stakeholders usually release pre-trained machine learning models as a service (MLaaS), which users can access via APIs. This paper considers two practical-yet-novel CL settings: data-efficient CL (DECL-APIs) and data-free CL (DFCL-APIs), which achieve CL from a stream of APIs with partial or no raw data. Performing CL under these two new settings faces several challenges: unavailable full raw data, unknown model parameters, heterogeneous models of arbitrary architecture and scale, and catastrophic forgetting of previous APIs. To overcome these issues, we propose a novel data-free cooperative continual distillation learning framework that distills knowledge from a stream of APIs into a CL model by generating pseudo data, just by querying APIs. Specifically, our framework includes two cooperative generators and one CL model, whose training is formulated as an adversarial game. We first use the CL model and the current API as fixed discriminators to train generators via a derivative-free method. Generators adversarially generate hard and diverse synthetic data to maximize the response gap between the CL model and the API. Next, we train the CL model by minimizing the gap between the responses of the CL model and the black-box API on synthetic data, to transfer the API's knowledge to the CL model. Furthermore, we propose a new regularization term based on network similarity to prevent catastrophic forgetting of previous APIs. Our method performs comparably to classic CL with full raw data on the MNIST and SVHN datasets in the DFCL-APIs setting. In the DECL-APIs setting, our method achieves 0.97×, 0.75×, and 0.69× the performance of classic CL on the more challenging CIFAR10, CIFAR100, and MiniImageNet, respectively.
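
The "response gap" between the CL model and the black-box API is the quantity the generators maximize and the CL model minimizes. The sketch below shows one way such a gap could be measured on a batch of synthetic inputs. It is an assumption, not the paper's definition: the abstract does not name the distance, so the mean L1 distance between class-probability vectors is used here, and the function name is hypothetical.

```python
from typing import Sequence


def response_gap(model_probs: Sequence[Sequence[float]],
                 api_probs: Sequence[Sequence[float]]) -> float:
    """Mean L1 distance between two batches of class-probability vectors.

    model_probs[i] and api_probs[i] are the responses of the CL model and
    of the black-box API to the same synthetic input i.
    The choice of L1 is an assumption made for illustration.
    """
    if len(model_probs) != len(api_probs):
        raise ValueError("batches must have the same number of responses")
    if not model_probs:
        return 0.0
    total = 0.0
    for p, q in zip(model_probs, api_probs):
        if len(p) != len(q):
            raise ValueError("responses must have the same number of classes")
        total += sum(abs(a - b) for a, b in zip(p, q))
    return total / len(model_probs)


# Hypothetical two-class responses to a single synthetic input.
gap = response_gap([[0.5, 0.5]], [[0.9, 0.1]])
```

Only the API's outputs are needed to evaluate this quantity, which is consistent with the black-box setting described above: no API parameters or gradients are touched.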

3.
IEEE Trans Neural Netw Learn Syst ; 33(2): 694-706, 2022 Feb.
Article in English | MEDLINE | ID: mdl-33108294

ABSTRACT

Negative sampling plays an important role in ranking-based recommender models. However, most existing sampling methods cannot generate informative item pairs with positive and negative instances due to two limitations: 1) they merely treat observed items as positive instances, ignoring the existence of potential positive items (i.e., nonobserved items users may prefer) and the probability of observed but noisy items and 2) they fail to capture the relationship between positive and negative items during negative sampling, which may cause the unexpected selection of potential positive items. In this article, we introduce a dynamic sampling strategy to search informative item pairs. Specifically, we first sample a positive instance from all the items by leveraging the overall features of the user's observed items. Then, we strategically select a negative instance by considering its correlation with the sampled positive one. Formally, we propose an item pair generative adversarial network named IPGAN, where our sampling strategy is realized in two generative models for positive and negative instances, respectively. In addition, IPGAN can also ensure that the sampled item pairs are informative relative to the ground truth by a discriminative model. What is more, we propose a batch-training approach to further enhance both user and item modeling by alleviating the special bias (noise) from different users. This approach can also significantly accelerate the process of model training compared with the classical GAN method for recommendation. Experimental results on three real data sets show that our approach outperforms other state-of-the-art approaches in terms of recommendation accuracy.
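
The two-step sampling order described in this abstract (a positive item first, then a negative item conditioned on it) can be illustrated with a small non-adversarial sketch. This is not IPGAN: there are no trained generators or discriminator here, and the item vectors, the mean-vector user profile, the dot-product scores, and all function names are assumptions made for illustration.

```python
import math
import random
from typing import Dict, Sequence, Set, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax_sample(scores: Dict[int, float], rng: random.Random) -> int:
    """Draw one key with probability proportional to exp(score)."""
    items = list(scores)
    top = max(scores.values())  # subtract the max for numerical stability
    weights = [math.exp(scores[i] - top) for i in items]
    return rng.choices(items, weights=weights, k=1)[0]


def sample_item_pair(item_vecs: Dict[int, Sequence[float]],
                     observed: Set[int],
                     rng: random.Random) -> Tuple[int, int]:
    """Return a (positive, negative) item pair for one user."""
    if not observed or len(item_vecs) < 2:
        raise ValueError("need at least one observed item and two items")
    dim = len(next(iter(item_vecs.values())))
    # Assumed user profile: the mean vector of the user's observed items.
    profile = [sum(item_vecs[i][d] for i in observed) / len(observed)
               for d in range(dim)]
    # Step 1: sample the positive from ALL items, not only observed ones.
    pos = softmax_sample({i: dot(profile, v) for i, v in item_vecs.items()},
                         rng)
    # Step 2: sample the negative by its similarity to the chosen positive.
    neg = softmax_sample({i: dot(item_vecs[pos], v)
                          for i, v in item_vecs.items() if i != pos}, rng)
    return pos, neg


# Hypothetical two-dimensional item vectors for four items.
item_vecs = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.9]}
pos, neg = sample_item_pair(item_vecs, observed={0, 1}, rng=random.Random(0))
```

Favoring negatives that resemble the positive is one simple reading of "considering its correlation"; the abstract leaves the exact scoring to the learned generative models.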

4.
J Healthc Eng ; 2017: 5967302, 2017.
Article in English | MEDLINE | ID: mdl-29118963

ABSTRACT

Nowadays, providing high-quality recommendation services to users is an essential component in web applications, including shopping, making friends, and healthcare. This can be regarded either as a problem of estimating users' preference by exploiting explicit feedback (numerical ratings), or as a problem of collaborative ranking with implicit feedback (e.g., purchases, views, and clicks). Previous work on this problem includes pointwise regression methods and pairwise ranking methods. The emerging healthcare websites and online medical databases impose a new challenge for medical service recommendation. In this paper, we develop a model, MBPR (Medical Bayesian Personalized Ranking over multiple users' actions), based on the simple observation that users tend to assign higher ranks to healthcare services that are also preferred in their other actions. Experimental results on real-world datasets demonstrate that MBPR achieves more accurate recommendations than several state-of-the-art methods and shows its generality and scalability via experiments on datasets from a mobile shopping app.
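
For orientation, the pairwise ranking idea that Bayesian Personalized Ranking builds on can be written as a per-pair loss, -log(sigmoid(s_pos - s_neg)), where s_pos and s_neg are a user's predicted scores for a preferred and a non-preferred item. The sketch below shows only this standard BPR term, not the multi-action extension that MBPR introduces; the function name and the scores in the usage lines are hypothetical.

```python
import math


def bpr_pair_loss(score_pos: float, score_neg: float) -> float:
    """Standard BPR loss for one (preferred, non-preferred) item pair.

    Computes -log(sigmoid(score_pos - score_neg)) in a numerically stable
    way, using -log(sigmoid(d)) = log(1 + exp(-d)).
    """
    diff = score_pos - score_neg
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    # For negative diff, rewrite to avoid overflow in exp(-diff).
    return -diff + math.log1p(math.exp(diff))


# Hypothetical scores: the loss is small when the preferred item ranks higher.
well_ranked = bpr_pair_loss(2.0, 0.0)
badly_ranked = bpr_pair_loss(0.0, 2.0)
```

The loss depends only on the score difference, so it rewards correct ordering of item pairs rather than accurate prediction of individual ratings, which is what distinguishes pairwise ranking from pointwise regression.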


Subject(s)
Health Services , Models, Statistical , Patient Preference , Algorithms , Bayes Theorem , Datasets as Topic , Internet , Patient Preference/statistics & numerical data , Referral and Consultation