Results 1 - 20 of 47
1.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36562719

ABSTRACT

BACKGROUND: Cell-penetrating peptides (CPPs) have received considerable attention as a means of transporting pharmacologically active molecules into living cells without damaging the cell membrane, and thus hold great promise as future therapeutics. Recently, several machine learning-based algorithms have been proposed for predicting CPPs. However, most existing predictive methods do not consider the agreement (disagreement) between similar (dissimilar) CPPs and depend heavily on expert knowledge-based handcrafted features. RESULTS: In this study, we present SiameseCPP, a novel deep learning framework for automated CPPs prediction. SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network consisting of a transformer and gated recurrent units. Contrastive learning is used for the first time to build a CPP predictive model. Comprehensive experiments demonstrate that our proposed SiameseCPP is superior to existing baseline models for predicting CPPs. Moreover, SiameseCPP also achieves good performance on other functional peptide datasets, exhibiting satisfactory generalization ability.


Subjects
Cell-Penetrating Peptides , Cell-Penetrating Peptides/metabolism , Algorithms , Biological Transport , Neural Networks, Computer , Machine Learning
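
The SiameseCPP abstract above mentions contrastive learning on a Siamese network but does not give the loss. As a rough illustration only, the sketch below shows the classic pairwise margin-based contrastive loss on two embedding vectors. The function names, the Euclidean distance and the `margin` value are assumptions for illustration, not the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_class, margin=1.0):
    """Pairwise margin-based contrastive loss (illustrative, assumed form).

    A same-class pair is penalised by its squared distance, which pulls it
    together. A different-class pair is penalised only while it is closer
    than `margin`, which pushes it apart.
    """
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

In a Siamese setup both embeddings would come from the same encoder with shared weights, and this loss would be averaged over sampled pairs.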
2.
Methods ; 229: 41-48, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38880433

ABSTRACT

Graph neural networks (GNNs) have gained significant attention in disease prediction where the latent embeddings of patients are modeled as nodes and the similarities among patients are represented through edges. The graph structure, which determines how information is aggregated and propagated, plays a crucial role in graph learning. Recent approaches typically create graphs based on patients' latent embeddings, which may not accurately reflect their real-world closeness. Our analysis reveals that raw data, such as demographic attributes and laboratory results, offers a wealth of information for assessing patient similarities and can serve as a compensatory measure for graphs constructed exclusively from latent embeddings. In this study, we first construct adaptive graphs from both latent representations and raw data respectively, and then merge these graphs via weighted summation. Given that the graphs may contain extraneous and noisy connections, we apply degree-sensitive edge pruning and kNN sparsification techniques to selectively sparsify and prune these edges. We conducted intensive experiments on two diagnostic prediction datasets, and the results demonstrate that our proposed method surpasses current state-of-the-art techniques.
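
The graph construction described in the abstract above (merging two patient graphs by weighted summation, then kNN sparsification) can be sketched in a few lines. This is a minimal pure-Python sketch under stated assumptions: the mixing weight `alpha`, the function names and the symmetrisation rule are hypothetical, and the degree-sensitive edge pruning step is not reproduced here.

```python
def merge_graphs(a_latent, a_raw, alpha=0.5):
    """Weighted sum of two n x n similarity matrices (lists of lists)."""
    n = len(a_latent)
    return [[alpha * a_latent[i][j] + (1 - alpha) * a_raw[i][j]
             for j in range(n)] for i in range(n)]


def knn_sparsify(adj, k):
    """Keep each node's k strongest edges (self-loops dropped), then
    symmetrise: an edge survives if either endpoint selected it."""
    n = len(adj)
    keep = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: adj[i][j], reverse=True)[:k]
        for j in neighbours:
            keep[i][j] = adj[i][j]
    for i in range(n):
        for j in range(i + 1, n):
            w = max(keep[i][j], keep[j][i])
            keep[i][j] = keep[j][i] = w
    return keep
```

With `alpha` close to 1 the merged graph follows the latent-embedding similarities, and with `alpha` close to 0 it follows the raw-data similarities.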

3.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35181793

ABSTRACT

Chromosomes are composed of many distinct chromatin domains, referred to variably as topological domains or topologically associating domains (TADs). These domains are stable across cell types and highly conserved across species, so they have been considered the basic units of chromosome folding and an important secondary structure in chromosome organization. However, identifying TAD boundaries remains a great challenge because of the high cost and low resolution of Hi-C data and experiments. In this study, we propose a novel ensemble learning framework, termed StackTADB, for predicting the boundaries of TADs. StackTADB integrates four base classifiers: Random Forest, Logistic Regression, K-Nearest Neighbor and Support Vector Machine. A series of evaluations on the data set from a previous study shows that StackTADB achieves the best performance on six metrics (AUC, Accuracy, MCC, Precision, Recall and F1 score) and is superior to existing methods. In addition, a comparison of multiple feature types shows that k-mer-based features play an essential role in predicting TAD boundaries in fruit flies, and we apply the SHapley Additive exPlanations (SHAP) framework to interpret the predictions of StackTADB and identify why k-mer-based features are vital. The experimental results show that subsequences matching the BEAF-32 motif play a crucial role in predicting TAD boundaries. The source code is freely available at https://github.com/HaoWuLab-Bioinformatics/StackTADB and the web server of StackTADB is freely available at http://hwtad.sdu.edu.cn:8002/StackTADB.


Subjects
Chromatin , Drosophila Proteins , Animals , Chromosomes , DNA-Binding Proteins/genetics , Drosophila/genetics , Drosophila Proteins/genetics , Eye Proteins/genetics , Machine Learning , Software
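
The StackTADB abstract above singles out k-mer features as essential. The sketch below shows one common way to turn a DNA sequence into a fixed-length k-mer frequency vector. The choice of `k`, the normalisation and the function name are assumptions for illustration; the abstract does not specify the encoding actually used.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Normalised k-mer frequencies in fixed lexicographic k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]
```

A vector like this (length 4^k) could serve as the input to any of the base classifiers named in the abstract.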
4.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35514205

ABSTRACT

BACKGROUND: Coronavirus disease 2019 (COVID-19) has spurred a boom in the search for existing drugs that can be repurposed. Drug repurposing is a strategy for identifying new uses for approved or investigational drugs that are outside the scope of the original medical indication. MOTIVATION: Current work on drug repurposing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is mostly limited to chemical medicines, to the analysis of a single drug targeting a single SARS-CoV-2 protein, and to a one-size-fits-all strategy that uses the same treatment (the same drug) for different stages of SARS-CoV-2 infection. To mitigate these issues, we first focused the research on herbal medicines. We then proposed a heterogeneous graph embedding method to identify candidate herbs for repurposing against each SARS-CoV-2 protein, and employed a variational graph convolutional network approach to recommend precise herb combinations as potential candidate treatments for each specific stage of infection. METHOD: We first employed virtual screening to construct the 'Herb-Compound' and 'Compound-Protein' docking graphs based on 480 herbal medicines, 12,735 associated chemical compounds and 24 SARS-CoV-2 proteins. Subsequently, the 'Herb-Compound-Protein' heterogeneous network was constructed by means of a metapath-based embedding approach. We then proposed a heterogeneous-information-network-based graph embedding method to generate candidate ranking lists of herbs that target structural, nonstructural and accessory SARS-CoV-2 proteins individually. To obtain precise and effective combination treatments for the various stages of COVID-19 infection, we employed the variational graph convolutional network method to generate candidate herb combinations as the recommended therapies. RESULTS: There were 24 ranking lists, each containing the top-10 herbs, targeting the 24 SARS-CoV-2 proteins correspondingly, and 20 herb combinations were generated as candidate stage-specific treatments targeting the four stages of infection. The code and supplementary materials are freely available at https://github.com/fanyang-AI/TCM-COVID19.


Subjects
COVID-19 Drug Treatment , Drug Combinations , Drug Repositioning/methods , Drugs, Investigational , Humans , SARS-CoV-2
5.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33169141

ABSTRACT

MOTIVATION: N7-methylguanosine (m7G) is an important epigenetic modification that plays an essential role in the regulation of gene expression. Accurate identification of m7G modifications will therefore help reveal and deepen understanding of their potential functional mechanisms. Although high-throughput experimental methods can locate m7G sites precisely, they are still not cost-effective. It is therefore necessary to develop new methods to identify m7G sites. RESULTS: In this work, using an iterative feature representation algorithm, we developed a machine learning-based method, m7G-IFL, to identify m7G sites. To demonstrate its superiority, m7G-IFL was evaluated and compared with existing predictors. The results demonstrate that our predictor outperforms existing predictors in terms of accuracy for identifying m7G sites. By analyzing and comparing the features used in the predictors, we found that the positive and negative samples were more clearly separated in our feature space than in existing feature spaces. This result demonstrates that our features captured more discriminative information through the iterative feature learning process, and thus contributed to the improvement in predictive performance.


Subjects
DNA Methylation , Epigenesis, Genetic , Guanosine/analogs & derivatives , Support Vector Machine , Guanosine/genetics , Guanosine/metabolism , HeLa Cells , Hep G2 Cells , Humans
6.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33152766

ABSTRACT

Origins of replication (ORIs), the sites at which genomic DNA replication is initiated, play essential roles in the DNA replication process. Detecting the genome-wide distribution of ORIs is a key step toward an in-depth understanding of their regulatory mechanisms. In this study, we present a novel machine learning-based approach called Stack-ORI, encompassing 10 cell-specific prediction models for identifying ORIs from four eukaryotic species (Homo sapiens, Mus musculus, Drosophila melanogaster and Arabidopsis thaliana). For each cell-specific model, we employed 12 feature encoding schemes that cover nucleic acid composition, position-specific information and physicochemical properties. The optimal feature set was identified for each encoding individually and used to develop a corresponding baseline model with the eXtreme Gradient Boosting (XGBoost) classifier. Subsequently, the predicted scores of the 12 baseline models were integrated as a novel feature vector to train XGBoost and develop the final model. Extensive experimental results show that Stack-ORI achieves significantly better performance than its baseline models on both training and independent datasets. Notably, Stack-ORI consistently outperforms the existing predictor in all cell-specific models, on both the training data and the independent test. Moreover, our approach provides interpretations that help explain the model's success by leveraging the SHapley Additive exPlanation algorithm, thus highlighting the feature encoding schemes most important for predicting cell-specific ORIs.


Subjects
Databases, Nucleic Acid , Models, Genetic , Replication Origin , Support Vector Machine , Transcription, Genetic , Animals , Drosophila melanogaster , Humans , Mice
7.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34117740

ABSTRACT

The prediction of peptide secondary structures is fundamentally important to reveal the functional mechanisms of peptides with potential applications as therapeutic molecules. In this study, we propose a multi-view deep learning method named Peptide Secondary Structure Prediction based on Multi-View Information, Restriction and Transfer learning (PSSP-MVIRT) for peptide secondary structure prediction. To sufficiently exploit discriminative information, we introduce a multi-view fusion strategy to integrate different information from multiple perspectives, including sequential information, evolutionary information and hidden state information, respectively, and generate a unified feature space. Moreover, we construct a hybrid network architecture of Convolutional Neural Network and Bi-directional Gated Recurrent Unit to extract global and local features of peptides. Furthermore, we utilize transfer learning to effectively alleviate the lack of training samples (peptides with experimentally validated structures). Comparative results on independent tests demonstrate that our proposed method significantly outperforms state-of-the-art methods. In particular, our method exhibits better performance at the segment level, suggesting the strong ability of our model in capturing local discriminative information. The case study also shows that our PSSP-MVIRT achieves promising and robust performance in the prediction of new peptide secondary structures. Importantly, we establish a webserver to implement the proposed method, which is currently accessible via http://server.malab.cn/PSSP-MVIRT. We expect it can be a useful tool for the researchers of interest, facilitating the wide use of our method.


Subjects
Algorithms , Computational Biology/methods , Deep Learning , Models, Molecular , Peptides/chemistry , Protein Structure, Secondary , Databases, Protein , Reproducibility of Results , Web Browser
8.
Methods ; 198: 65-75, 2022 02.
Article in English | MEDLINE | ID: mdl-34555529

ABSTRACT

Epistasis between single nucleotide polymorphisms (SNPs) plays an important role in elucidating the missing heritability of complex diseases. Diverse approaches have been invented for detecting SNP interactions, but they canonically neglect the important and useful connections between SNPs and other bio-molecules (i.e., miRNAs and lncRNAs). To comprehensively model these disease related molecules, a heterogeneous bio-molecular network based solution EpiHNet is introduced for high-order SNP interactions detection. EpiHNet firstly uses case/control data to construct an SNP statistical network, and meta-path based similarity on the heterogeneous network composed with SNPs, genes, lncRNAs, miRNAs and diseases to define another SNP relational network. The SNP relational network can explore and exploit different associations between molecules and diseases to complement the SNP statistical network and search the significantly associated SNPs. Next, EpiHNet integrates these two networks into a composite network, applies the modularity based clustering with fast search strategy to divide SNP nodes into different clusters. After that, it detects SNP interactions based on SNP combinations derived from each cluster. Synthetic experiments on diverse two-locus and three-locus disease models manifest that EpiHNet outperforms competitive baselines, even without the heterogeneous network. For real WTCCC breast cancer data, EpiHNet also demonstrates expressive results on detecting high-order SNP interactions.


Subjects
Epistasis, Genetic , Genome-Wide Association Study , Algorithms , Case-Control Studies , Cluster Analysis , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide
9.
Methods ; 207: 65-73, 2022 11.
Article in English | MEDLINE | ID: mdl-36122881

ABSTRACT

Abnormal co-occurrence medical visit behavior is a form of medical insurance fraud. Specifically, an organized gang of fraudsters hold multiple medical insurance cards and purchase similar drugs frequently at the same time and the same location in order to siphon off medical insurance funds. Conventional identification methods to identify such behaviors rely mainly on manual auditing, making it difficult to satisfy the needs of identifying the small number of fraudulent behaviors in the large-scale medical data. On the other hand, the existing single-view bi-clustering algorithms only consider the features of the time-location dimension while neglecting the similarities in prescriptions and neglecting the fact that fraudsters may belong to multiple gangs. Therefore, in this paper, we present a multi-view bi-clustering method for identifying abnormal co-occurrence medical visit behavioral patterns, which performs cluster analysis simultaneously on the large-scale, complex and diverse visiting record dimension and prescription dimension to identify bi-clusters with similar time-location features. The proposed method constructs a matrix view of patients and visit records as well as a matrix view of patients and prescriptions, while decomposing multiple data matrices into sparse row and column vectors to obtain a consistent patient population across views. Subsequently the proposed method identifies the corresponding abnormal co-occurrence medical visit behavior and may greatly facilitate the safe operations and the sustainability of medical insurance funds. The experimental results show that our proposed method leads to more efficient and more accurate identifications of abnormal co-occurrence medical visit behavior, demonstrating its high efficiency and effectiveness.


Subjects
Algorithms , Humans , Cluster Analysis
10.
Bioinformatics ; 37(24): 4684-4693, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34323948

ABSTRACT

MOTIVATION: Anticancer peptides (ACPs) have recently emerged as effective anticancer drugs in cancer therapy. Machine learning-based predictors have been developed to identify ACPs and achieve satisfactory performance. However, existing methods suffer from experience-based feature engineering, which not only restricts the representation ability of the models to a certain extent but also lacks adaptivity for different data, limiting the further improvement of the predictive performance and impacting the robustness of the predictive models. To alleviate the above problems, we propose a novel deep-learning-based predictor named ACPred-LAF, in which we propose a novel multisense and multiscaled embedding algorithm to automatically learn and extract context sequential characteristics of ACPs. RESULTS: Through the feature comparative analysis, we demonstrate that our learnable and self-adaptive embedding features are better than hand-crafted features in capturing discriminative information, which can effectively benefit the performance improvement for ACP prediction. In addition, benchmarking comparison results demonstrate that our ACPred-LAF outperforms the state-of-the-art methods both on existing benchmark datasets and our newly constructed dataset. Furthermore, we also prove and validate the robustness of the model via the data interference experiment. To avoid potential evaluation bias, here, we construct a new ACP benchmark dataset named ACP-Mixed by integrating existing datasets. We expect our newly constructed dataset to be a golden standard benchmark dataset in this field. To facilitate the use of our model, we develop a web server as the implementation of ACPred-LAF. AVAILABILITY AND IMPLEMENTATION: Our proposed ACPred-LAF, newly constructed benchmark dataset ACP-Mixed are open source collaborative initiatives available in the GitHub repository (https://github.com/TearsWaiting/ACPred-LAF). 
In addition, a web server implementing ACPred-LAF can be accessed via http://server.malab.cn/ACPred-LAF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Antineoplastic Agents , Computational Biology , Computational Biology/methods , Peptides , Algorithms , Machine Learning
11.
Bioinformatics ; 37(24): 4603-4610, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34601568

ABSTRACT

MOTIVATION: DNA methylation plays an important role in epigenetic modification, the occurrence, and the development of diseases. Therefore, identification of DNA methylation sites is critical for better understanding and revealing their functional mechanisms. To date, several machine learning and deep learning methods have been developed for the prediction of different DNA methylation types. However, they still highly rely on manual features, which can largely limit the high-latent information extraction. Moreover, most of them are designed for one specific DNA methylation type, and therefore cannot predict multiple methylation sites in multiple species simultaneously. In this study, we propose iDNA-ABT, an advanced deep learning model that utilizes adaptive embedding based on Bidirectional Encoder Representations from Transformers (BERT) together with transductive information maximization (TIM). RESULTS: Benchmark results show that our proposed iDNA-ABT can automatically and adaptively learn the distinguishing features of biological sequences from multiple species, and thus perform significantly better than the state-of-the-art methods in predicting three different DNA methylation types. In addition, TIM loss is proven to be effective in dichotomous tasks via the comparison experiment. Furthermore, we verify that our features have strong adaptability and robustness to different species through comparison of adaptive embedding and six handcrafted feature encodings. Importantly, our model shows great generalization ability in different species, demonstrating that our model can adaptively capture the cross-species differences and improve the predictive performance. For the convenient use of our method, we further established an online webserver as the implementation of the proposed iDNA-ABT. 
AVAILABILITY AND IMPLEMENTATION: Our proposed iDNA-ABT and data are freely accessible via http://server.wei-group.net/iDNA_ABT and our source codes are available for downloading in the GitHub repository (https://github.com/YUYING07/iDNA_ABT). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
DNA Methylation , Deep Learning , Software , Machine Learning , Epigenesis, Genetic
12.
Anal Chem ; 93(16): 6481-6490, 2021 04 27.
Article in English | MEDLINE | ID: mdl-33843206

ABSTRACT

The detectability of peptides is fundamentally important in shotgun proteomics experiments. At present, there are many computational methods to predict the detectability of peptides based on sequential composition or physicochemical properties, but they all have various shortcomings. Here, we present PepFormer, a novel end-to-end Siamese network coupled with a hybrid architecture of a Transformer and gated recurrent units that is able to predict the peptide detectability based on peptide sequences only. Specially, we, for the first time, use contrastive learning and construct a new loss function for model training, greatly improving the generalization ability of our predictive model. Comparative results demonstrate that our model performs significantly better than state-of-the-art methods on benchmark data sets in two species (Homo sapiens and Mus musculus). To make the model more interpretable, we further investigate the embedded representations of peptide sequences automatically learnt from our model, and the visualization results indicate that our model can efficiently capture high-latent discriminative information, improving the predictive performance. In addition, our model shows a strong ability of cross-species transfer learning and adaptability, demonstrating that it has great potential in robust prediction of peptides detectability on different species. The source code of our proposed method can be found via https://github.com/WLYLab/PepFormer.


Subjects
Peptides , Proteomics , Animals , Humans , Mice , Peptides/analysis
13.
Glob Chang Biol ; 26(2): 931-943, 2020 02.
Article in English | MEDLINE | ID: mdl-31554024

ABSTRACT

Nitrous oxide (N2O) emissions from soil contribute to global warming and are in turn substantially affected by climate change. However, climate change impacts on N2O production across terrestrial ecosystems remain poorly understood. Here, we synthesized 46 published studies of N2O fluxes and relevant soil functional genes (SFGs, that is, archaeal amoA, bacterial amoA, nosZ, narG, nirK and nirS) to assess their responses to increased temperature, increased or decreased precipitation amounts, and prolonged drought (no change in total precipitation but an increase in precipitation intervals) in terrestrial ecosystems (i.e. grasslands, forests, shrublands, tundra and croplands). Across the data set, increased temperature raised N2O emissions by 33%. However, the effects were highly variable across biomes, with the strongest temperature responses in shrublands, variable responses in forests and negative responses in tundra. The warming methods employed also influenced the effects of temperature on N2O emissions (most effectively induced by open-top chambers). Whole-day or whole-year warming treatment significantly enhanced N2O emissions, but daytime, nighttime or short-season warming did not have significant effects. Regardless of biome, treatment method and season, increased precipitation promoted N2O emission by an average of 55%, while decreased precipitation suppressed N2O emission by 31%, predominantly driven by changes in soil moisture. The effect size of precipitation changes on nirS and nosZ showed a U-shaped relationship with soil moisture; further insight into the biotic mechanisms underlying the N2O emission response to climate change remains limited by data availability, underlining a need for studies that report SFGs. Our findings indicate that climate change substantially affects N2O emission and highlight the urgent need to incorporate this strong feedback into most climate models for convincing projection of future climate change.


Subjects
Climate Change , Ecosystem , Nitrous Oxide , Soil , Tundra
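
The percentage effects quoted in the N2O abstract above (for example a 33% increase) are the kind of number a meta-analysis derives from paired treatment and control means. The abstract does not say which effect-size metric was used; the sketch below assumes the log response ratio, a common choice in ecological meta-analyses, and uses an unweighted mean for simplicity. All function names and inputs are hypothetical.

```python
import math


def log_response_ratio(treatment_mean, control_mean):
    """lnRR = ln(treatment / control) for one paired observation."""
    return math.log(treatment_mean / control_mean)


def percent_change(ln_rr):
    """Back-transform a log response ratio into a percentage change."""
    return (math.exp(ln_rr) - 1.0) * 100.0


def mean_effect(pairs):
    """Unweighted mean lnRR over (treatment, control) pairs, as a percent.

    Real meta-analyses usually weight each study by its inverse variance;
    that step is omitted in this sketch.
    """
    ln_rrs = [log_response_ratio(t, c) for t, c in pairs]
    return percent_change(sum(ln_rrs) / len(ln_rrs))
```

Averaging on the log scale keeps a doubling and a halving symmetric, so the two cancel to roughly zero net change.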
14.
Article in English | MEDLINE | ID: mdl-37665697

ABSTRACT

Major depressive disorder (MDD) is the most common psychological disease. To improve the recognition accuracy of MDD, more and more machine learning methods have been proposed to mine EEG features, i.e. typical brain functional patterns and recognition methods that are closely related to depression, using resting EEG signals. Most existing methods use thresholds to filter weak connections in the brain functional connectivity network (BFCN) and construct quantitative statistical features of brain function to measure the BFCN. However, these thresholds may remove too many weak connections that have functional relevance, which hinders the discovery of potential hidden patterns in weak connections. In addition, statistical features cannot describe the topological structure features and information network propagation patterns of the brain's different functional regions. To solve these problems, we propose a novel MDD recognition method based on a multi-granularity graph convolution network (MGGCN). On the one hand, this method applies multiple sets of different thresholds to build a multi-granularity functional neural network, which can remove noise while fully retaining valuable weak connections. On the other hand, this method uses a graph neural network to learn the topological structure features and brain saliency patterns of changes between brain functional regions on the multi-granularity functional neural network. Experimental results on the benchmark datasets validate the superior performance and time complexity of MGGCN. The analysis shows that as the granularity increases, the connectivity defects in the right frontal (RF) and right temporal (RT) regions and in the left temporal (LT) and left posterior (LP) regions increase. The brain functional connections in these regions can serve as potential biomarkers for MDD recognition.


Subjects
Depressive Disorder, Major , Humans , Depressive Disorder, Major/diagnosis , Magnetic Resonance Imaging/methods , Neural Pathways , Brain , Recognition, Psychology
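
The multi-granularity idea in the MGGCN abstract above (several thresholds applied to one connectivity matrix instead of a single cut-off) can be illustrated as follows. This is a minimal sketch: the function names, the use of absolute connectivity values and the example thresholds are assumptions, and the graph convolution stage is not shown.

```python
def threshold_graph(conn, tau):
    """Binary adjacency: edge between i and j iff |conn[i][j]| >= tau."""
    n = len(conn)
    return [[1 if i != j and abs(conn[i][j]) >= tau else 0
             for j in range(n)] for i in range(n)]


def multi_granularity_graphs(conn, thresholds):
    """One adjacency matrix per threshold, ordered from the most
    permissive (lowest) threshold to the strictest (highest)."""
    return {tau: threshold_graph(conn, tau) for tau in sorted(thresholds)}
```

The low-threshold graphs retain weak connections that a single strict threshold would discard, while the high-threshold graphs keep only the strongest links.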
15.
Interdiscip Sci ; 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38489147

ABSTRACT

Survival analysis, as a widely used method for analyzing and predicting the timing of event occurrence, plays a crucial role in the medicine field. Medical professionals utilize survival models to gain insight into the effects of patient covariates on the disease, and the correlation with the effectiveness of different treatment strategies. This knowledge is essential for the development of treatment plans and the enhancement of treatment approaches. Conventional survival models, such as the Cox proportional hazards model, require a significant amount of feature engineering or prior knowledge to facilitate personalized modeling. To address these limitations, we propose a novel residual-based self-attention deep neural network for survival modeling, called ResDeepSurv, which combines the benefits of neural networks and the Cox proportional hazards regression model. The model proposed in our study simulates the distribution of survival time and the correlation between covariates and outcomes, but does not impose strict assumptions on the basic distribution of survival data. This approach effectively accounts for both linear and nonlinear risk functions in survival data analysis. The performance of our model in analyzing survival data with various risk functions is on par with or even superior to that of other existing survival analysis methods. Furthermore, we validate the superior performance of our model in comparison to currently existing methods by evaluating multiple publicly available clinical datasets. Through this study, we prove the effectiveness of our proposed model in survival analysis, providing a promising alternative to traditional approaches. The application of deep learning techniques and the ability to capture complex relationships between covariates and survival outcomes without relying on extensive feature engineering make our model a valuable tool for personalized medicine and decision-making in clinical practice.
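
The abstract above says ResDeepSurv combines a neural network with the Cox proportional hazards model but does not state its training objective. Deep Cox-style models are commonly trained by minimising the negative Cox partial log-likelihood of the network's risk scores; the sketch below assumes that objective, without tie correction. The function name and inputs are hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood (no special handling of ties).

    risk_scores: model outputs h(x_i), e.g. from a neural network
    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    """
    loss = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        log_denom = math.log(sum(math.exp(risk_scores[j])
                                 for j, t_j in enumerate(times)
                                 if t_j >= t_i))
        loss -= risk_scores[i] - log_denom
    return loss
```

The loss is lower when subjects who experience the event earlier receive higher risk scores, which is the ranking behaviour a survival model is trained for.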

16.
IEEE J Biomed Health Inform ; 28(4): 2294-2303, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38598367

ABSTRACT

Medicine package recommendation aims to assist doctors in clinical decision-making by recommending appropriate packages of medicines for patients. Current methods model this task as a multi-label classification or sequence generation problem, focusing on learning relationships between individual medicines and other medical entities. However, these approaches uniformly overlook the interactions between medicine packages and other medical entities, potentially resulting in incomplete recommended medicine packages. Furthermore, the medicine commonsense knowledge considered by current methods is notably limited, making it challenging to delve into the decision-making processes of doctors. To solve these problems, we propose DIAGNN, a Dual-level Interaction Aware heterogeneous Graph Neural Network for medicine package recommendation. Specifically, DIAGNN explicitly models interactions of medical entities within electronic health records (EHRs) at two levels, individual medicine and medicine package, leveraging a heterogeneous graph. A dual-level interaction aware graph convolutional network is used to capture semantic information in the medical heterogeneous graph. Additionally, we incorporate medication indications into the medical heterogeneous graph as medicine commonsense knowledge. Extensive experimental results on real-world datasets validate the effectiveness of the proposed method.


Subjects
Clinical Decision-Making , Electronic Health Records , Humans , Knowledge , Neural Networks, Computer , Semantics
17.
Article in English | MEDLINE | ID: mdl-39028598

ABSTRACT

Federated learning aims to facilitate collaborative training among multiple clients with data heterogeneity in a privacy-preserving manner, which either generates the generalized model or develops personalized models. However, existing methods typically struggle to balance both directions, as optimizing one often leads to failure in another. To address the problem, this article presents a method named personalized federated learning via cross silo prototypical calibration (pFedCSPC) to enhance the consistency of knowledge of clients by calibrating features from heterogeneous spaces, which contributes to enhancing the collaboration effectiveness between clients. Specifically, pFedCSPC employs an adaptive aggregation method to offer personalized initial models to each client, enabling rapid adaptation to personalized tasks. Subsequently, pFedCSPC learns class representation patterns on clients by clustering, averages the representations within each cluster to form local prototypes, and aggregates them on the server to generate global prototypes. Meanwhile, pFedCSPC leverages global prototypes as knowledge to guide the learning of local representation, which is beneficial for mitigating the data imbalanced problem and preventing overfitting. Moreover, pFedCSPC has designed a cross-silo prototypical calibration (CSPC) module, which utilizes contrastive learning techniques to map heterogeneous features from different sources into a unified space. This can enhance the generalization ability of the global model. Experiments were conducted on four datasets in terms of performance comparison, ablation study, in-depth analysis, and case study, and the results verified that pFedCSPC achieves improvements in both global generalization and local personalization performance via calibrating cross-source features and strengthening collaboration effectiveness, respectively.
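
The prototype step described in the pFedCSPC abstract above (average representations on each client, then aggregate on the server) can be sketched as below. As a simplification, this sketch groups features by class label rather than by learned clusters, and the server uses a plain unweighted mean; both choices and all names are assumptions, not the authors' method.

```python
def local_prototypes(features, labels):
    """Client side: average the feature vectors of each class."""
    sums, counts = {}, {}
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def global_prototypes(client_protos):
    """Server side: average each class prototype over the clients
    that actually hold samples of that class."""
    grouped = {}
    for protos in client_protos:
        for y, p in protos.items():
            grouped.setdefault(y, []).append(p)
    return {y: [sum(col) / len(ps) for col in zip(*ps)]
            for y, ps in grouped.items()}
```

Only the prototypes leave each client, not the raw samples, which is what makes this kind of exchange compatible with the privacy-preserving setting the abstract describes.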

18.
Article in English | MEDLINE | ID: mdl-38324430

ABSTRACT

Federated learning has recently been applied to recommendation systems to protect user privacy. In federated learning settings, recommendation systems can train recommendation models by collecting intermediate parameters instead of the real user data, which greatly enhances user privacy. In addition, federated recommendation systems (FedRSs) can cooperate with other data platforms to improve recommendation performance while meeting regulatory and privacy constraints. However, FedRSs face many new challenges such as privacy, security, heterogeneity, and communication costs. While significant research has been conducted in these areas, gaps in the survey literature still exist. In this article, we: 1) summarize some common privacy mechanisms used in FedRSs and discuss the advantages and limitations of each mechanism; 2) review several novel security attacks and the defenses against them; 3) summarize some approaches to address heterogeneity and communication cost problems; 4) introduce some realistic applications and public benchmark datasets for FedRSs; and 5) present some prospective directions for future research. This article can help researchers and practitioners understand the research progress in these areas.

19.
Diagnostics (Basel) ; 14(11)2024 May 31.
Article in English | MEDLINE | ID: mdl-38893693

ABSTRACT

Background: Long COVID, characterized by a persistent symptom spectrum following SARS-CoV-2 infection, poses significant health, social, and economic challenges. This review aims to consolidate knowledge on its epidemiology, clinical features, and underlying mechanisms to guide global responses. Methods: We conducted a literature review, analyzing peer-reviewed articles and reports to gather comprehensive data on long COVID's epidemiology, symptomatology, and management approaches. Results: Our analysis revealed a wide array of long COVID symptoms and risk factors, with notable demographic variability. The current understanding of its pathophysiology suggests a multifactorial origin yet remains partially understood. Emerging diagnostic criteria and potential therapeutic strategies were identified, highlighting advancements in long COVID management. Conclusions: This review highlights the multifaceted nature of long COVID, revealing a broad spectrum of symptoms, diverse risk factors, and the complex interplay of physiological mechanisms underpinning the condition. Long COVID symptoms and disorders will continue to weigh on healthcare systems in years to come. Addressing long COVID requires a holistic management strategy that integrates clinical care, social support, and policy initiatives. The findings underscore the need for increased international cooperation in research and health planning to address the complex challenges of long COVID. There is a call for continued refinement of diagnostic and treatment modalities, emphasizing a multidisciplinary approach to manage the ongoing and evolving impacts of the condition.

20.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9587-9603, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35344498

ABSTRACT

In parallel with the rapid adoption of artificial intelligence (AI) empowered by advances in AI research, there has been growing awareness of and concern about data privacy. Recent significant developments in the data regulation landscape have prompted a seismic shift in interest toward privacy-preserving AI. This has contributed to the popularity of Federated Learning (FL), the leading paradigm for the training of machine learning models on data silos in a privacy-preserving manner. In this survey, we explore the domain of personalized FL (PFL) to address the fundamental challenges of FL on heterogeneous data, a universal characteristic inherent in all real-world datasets. We analyze the key motivations for PFL and present a unique taxonomy of PFL techniques categorized according to the key challenges and personalization strategies in PFL. We highlight their key ideas, challenges, and opportunities, and envision promising future trajectories of research toward a new PFL architectural design, realistic PFL benchmarking, and trustworthy PFL approaches.
