Results 1 - 20 of 59
1.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36305457

ABSTRACT

With advances in research on the complex aetiology of many diseases, computational drug repositioning has proven to be a shortcut around costly and inefficient traditional methods. Developing more promising computational methods is therefore indispensable for finding new candidate diseases to treat with existing drugs. In this paper, a model called GLGMPNN, which integrates a new variant of the message passing neural network with a novel gated fusion mechanism, is proposed for drug-disease association prediction. First, a light-gated message passing neural network (LGMPNN), comprising message passing, aggregation and updating, is proposed to separately extract multiple pieces of information from the similarity networks and the association network. Then, a gated fusion mechanism consisting of a forget gate and an output gate is applied to control the extent to which these pieces of information are integrated. The forget gate, calculated from the multiple embeddings, integrates the association information into the similarity information. The final node representations are controlled by the output gate, which fuses the topology information of the networks with the initial similarity information. Finally, a bilinear decoder is adopted to reconstruct an adjacency matrix for drug-disease associations. Evaluated by 10-fold cross-validation, GLGMPNN achieves excellent performance compared with current models. Follow-up studies show that the model can effectively discover novel drug-disease associations.
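As an illustration of the bilinear decoder mentioned in the abstract above, the following minimal sketch scores every drug-disease pair as sigmoid(h_drug^T W h_disease). The sigmoid output, the function name `bilinear_decode`, and the toy embeddings are assumptions made for illustration; the abstract does not specify the decoder's exact form.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bilinear_decode(drug_emb, disease_emb, weight):
    """Return a score matrix with entry (i, j) = sigmoid(drug_i^T W disease_j)."""
    scores = []
    for h_drug in drug_emb:
        # Project the drug embedding through W once per drug: h_drug^T W.
        projected = [
            sum(h_drug[k] * weight[k][l] for k in range(len(h_drug)))
            for l in range(len(weight[0]))
        ]
        # Dot the projection with every disease embedding.
        row = [
            sigmoid(sum(p * h_dis[l] for l, p in enumerate(projected)))
            for h_dis in disease_emb
        ]
        scores.append(row)
    return scores


# Hypothetical 2-dimensional embeddings for 2 drugs and 3 diseases.
drug_emb = [[1.0, 0.0], [0.0, 1.0]]
disease_emb = [[1.0, 1.0], [0.0, 0.0], [-1.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity stands in for a learned matrix
scores = bilinear_decode(drug_emb, disease_emb, weight)
```

The resulting matrix has one row per drug and one column per disease, which is the shape of the adjacency matrix the decoder is said to reconstruct.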


Subjects
Computational Biology , Neural Networks, Computer , Computational Biology/methods , Drug Repositioning/methods , Algorithms
2.
BMC Bioinformatics ; 22(Suppl 3): 241, 2021 May 12.
Article in English | MEDLINE | ID: mdl-33980147

ABSTRACT

BACKGROUND: With the development of science and technology, there is increasing evidence of associations between lncRNAs and human diseases. Finding these associations will therefore have a major impact on the treatment and prevention of some diseases. However, finding them experimentally is very difficult and requires a great deal of time and effort, so good methods for predicting lncRNA-disease associations (LDAs) are particularly important. RESULTS: In this paper, we propose a method based on dual sparse collaborative matrix factorization (DSCMF) to predict LDAs. DSCMF improves on the traditional collaborative matrix factorization method. To increase sparsity, the L2,1-norm is added to our method. In addition, the Gaussian interaction profile kernel is added, which increases the network similarity between lncRNAs and diseases. Finally, the AUC value obtained by ten-fold cross-validation is used to evaluate the quality of our method. CONCLUSIONS: The AUC value obtained by the DSCMF method is 0.8523. At the end of the paper, a simulation experiment is carried out, and the experimental results for prostate cancer, breast cancer, ovarian cancer and colorectal cancer are analyzed in detail. The DSCMF method is expected to help research on lncRNA-disease associations. The code is available at https://github.com/Ming-0113/DSCMF.
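The Gaussian interaction profile kernel named in the abstract above can be sketched as follows, using the commonly cited formulation K(i, j) = exp(-gamma * ||a_i - a_j||^2) with gamma set to 1 over the mean squared norm of the profiles. This is a minimal sketch, not the authors' code: the function name `gip_kernel`, the choice gamma' = 1, and the toy association matrix are assumptions, and the function expects at least one known association.

```python
import math


def gip_kernel(profiles):
    """Gaussian interaction profile kernel over the rows of a 0/1 association matrix."""
    n = len(profiles)
    # Bandwidth: 1 / (mean squared norm of the profiles), i.e. gamma' = 1 (assumed).
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = 1.0 / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical lncRNA-by-disease association matrix (3 lncRNAs, 3 diseases).
assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
K = gip_kernel(assoc)
```

Rows with more shared associations receive a similarity closer to 1; applying the same function to the transposed matrix gives the disease-side kernel.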


Subjects
Breast Neoplasms , Prostatic Neoplasms , RNA, Long Noncoding , Algorithms , Computer Simulation , Humans , Male , Prostatic Neoplasms/genetics , RNA, Long Noncoding/genetics
3.
BMC Bioinformatics ; 21(1): 445, 2020 Oct 07.
Article in English | MEDLINE | ID: mdl-33028187

ABSTRACT

BACKGROUND: As a machine learning method with high performance and excellent generalization ability, extreme learning machine (ELM) is gaining popularity in various studies. Various ELM-based methods for different fields have been proposed. However, the robustness to noise and outliers is always the main problem affecting the performance of ELM. RESULTS: In this paper, an integrated method named correntropy induced loss based sparse robust graph regularized extreme learning machine (CSRGELM) is proposed. The introduction of correntropy induced loss improves the robustness of ELM and weakens the negative effects of noise and outliers. By using the L2,1-norm to constrain the output weight matrix, we tend to obtain a sparse output weight matrix to construct a simpler single hidden layer feedforward neural network model. By introducing the graph regularization to preserve the local structural information of the data, the classification performance of the new method is further improved. Besides, we design an iterative optimization method based on the idea of half quadratic optimization to solve the non-convex problem of CSRGELM. CONCLUSIONS: The classification results on the benchmark dataset show that CSRGELM can obtain better classification results compared with other methods. More importantly, we also apply the new method to the classification problems of cancer samples and get a good classification effect.
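To see why a correntropy-induced loss is less sensitive to outliers than a squared loss, the sketch below compares the two on a few residuals. It uses the common form 1 - exp(-e^2 / (2 sigma^2)); the exact scaling used in CSRGELM is not given in the abstract, so the formula, the `sigma` default, and the sample errors are assumptions.

```python
import math


def squared_loss(error):
    return error * error


def correntropy_induced_loss(error, sigma=1.0):
    """Bounded loss 1 - exp(-e^2 / (2 sigma^2)); it saturates near 1 for large errors."""
    return 1.0 - math.exp(-(error * error) / (2.0 * sigma * sigma))


# Small, moderate, and outlier-sized residuals (illustrative values).
errors = [0.1, 1.0, 10.0]
comparison = [(e, squared_loss(e), correntropy_induced_loss(e)) for e in errors]
for e, sq, cl in comparison:
    print(f"error={e:5.1f}  squared={sq:8.3f}  correntropy-induced={cl:.3f}")
```

The squared loss grows without bound as the residual grows, while the correntropy-induced loss never exceeds 1, so a single outlier cannot dominate the objective.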


Subjects
Machine Learning , Neoplasms/classification , Benchmarking , Computational Biology/methods , Databases, Factual , Humans , Neoplasms/pathology
4.
BMC Bioinformatics ; 21(1): 454, 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33054708

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) are non-coding RNAs with regulatory functions. Many studies have shown that miRNAs are closely associated with human diseases. Traditional methods for exploring the relationship between miRNAs and diseases are time-consuming, and their accuracy needs to be improved. In view of the shortcomings of previous models, a method called collaborative matrix factorization based on matrix completion (MCCMF) is proposed to predict unknown miRNA-disease associations. RESULTS: The complete miRNA-disease matrix is obtained by matrix completion. The Gaussian interaction profile kernel is added to the miRNA functional similarity matrix and the disease semantic similarity matrix. The Weighted K Nearest Known Neighbors method is then used to preprocess the association matrix so that the model is closer to reality. Finally, collaborative matrix factorization is applied to obtain the prediction results. MCCMF obtains a satisfactory result in fivefold cross-validation, with an AUC of 0.9569 (0.0005). CONCLUSIONS: The AUC value of MCCMF is higher than that of other advanced methods in the fivefold cross-validation experiment. To evaluate the performance of MCCMF comprehensively, accuracy, precision, recall and F-measure are also reported. The final experimental results demonstrate that MCCMF outperforms other methods in predicting miRNA-disease associations. The effectiveness and practicability of MCCMF are further verified by studying three specific diseases.
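The Weighted K Nearest Known Neighbors preprocessing step mentioned in the abstract above can be sketched as follows. This follows the commonly described formulation (neighbour profiles weighted by similarity times a decay term, row-side and column-side estimates averaged, and known associations never lowered); it is not the authors' implementation. The function name `wknkn`, the parameters `k` and `decay`, the simplification that every other row or column counts as a candidate neighbour, and the toy matrices are all assumptions.

```python
def wknkn(assoc, sim_rows, sim_cols, k=2, decay=0.5):
    """Fill a sparse association matrix using similarity-weighted neighbour profiles."""
    n_rows, n_cols = len(assoc), len(assoc[0])

    def neighbor_estimate(profiles, sim):
        estimates = []
        for i in range(len(profiles)):
            # The k most similar other entities, most similar first.
            neighbors = sorted(
                (j for j in range(len(profiles)) if j != i),
                key=lambda j: sim[i][j],
                reverse=True,
            )[:k]
            norm = sum(sim[i][j] for j in neighbors)
            row = [0.0] * len(profiles[0])
            if norm > 0:
                for rank, j in enumerate(neighbors):
                    weight = (decay ** rank) * sim[i][j]
                    for c, value in enumerate(profiles[j]):
                        row[c] += weight * value / norm
            estimates.append(row)
        return estimates

    row_est = neighbor_estimate(assoc, sim_rows)
    columns = [list(col) for col in zip(*assoc)]
    col_est = neighbor_estimate(columns, sim_cols)  # indexed [column][row]
    # Average both views and never lower a known association.
    return [
        [max(assoc[i][j], (row_est[i][j] + col_est[j][i]) / 2.0) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical data: 3 miRNAs x 2 diseases, with similarity matrices for each side.
assoc = [[1, 0], [0, 0], [0, 1]]
sim_mirna = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.3], [0.1, 0.3, 1.0]]
sim_disease = [[1.0, 0.5], [0.5, 1.0]]
filled = wknkn(assoc, sim_mirna, sim_disease)
```

In this toy case the second miRNA has no known association, and the step gives it small non-zero scores borrowed from its neighbours, which is the kind of sparsity reduction the abstract attributes to the preprocessing.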


Subjects
Algorithms , Genetic Predisposition to Disease , MicroRNAs/genetics , Area Under Curve , Gene Regulatory Networks , Hepatoblastoma/genetics , Humans , ROC Curve , Reproducibility of Results , Retinoblastoma/genetics , Risk Factors
5.
Hum Genomics ; 13(Suppl 1): 46, 2019 10 22.
Article in English | MEDLINE | ID: mdl-31639067

ABSTRACT

BACKGROUND: As one of the most popular data representation methods, non-negative matrix factorization (NMF) has received wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometrical structure in the data. At the same time, noise and outliers are inevitably present in the data. RESULTS: To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures the relationship between pairs of sample points, it captures high-order relationships among more sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual, because the L2,1-norm is insensitive to noise and outliers. CONCLUSIONS: Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that the proposed model outperforms other state-of-the-art methods.


Subjects
Algorithms , Databases, Genetic , Gene Expression Regulation, Neoplastic , Cluster Analysis , Humans , Neoplasms/genetics
6.
Hum Hered ; 84(1): 47-58, 2019.
Article in English | MEDLINE | ID: mdl-31466072

ABSTRACT

Principal component analysis (PCA) is a widely used method for evaluating low-dimensional data. Some variants of PCA have been proposed to improve the interpretation of the principal components (PCs). One of the most common is sparse PCA, which aims to find a sparse basis that is easier to interpret than the dense basis of PCA. However, the performance of these improved methods is still far from satisfactory because the data still contain redundant PCs. In this paper, a novel method called PCA based on graph Laplacian and double sparse constraints (GDSPCA) is proposed to improve the interpretation of the PCs and to take the internal geometry of the data into account. In detail, GDSPCA uses L2,1-norm and L1-norm regularization terms simultaneously to enforce sparsity in the matrix by filtering out redundant and irrelevant PCs: the L2,1-norm regularization term produces row sparsity, while the L1-norm regularization term enforces element sparsity. In this way, the new PCs in the low-dimensional subspace can be interpreted better. Meanwhile, GDSPCA integrates the graph Laplacian into PCA to explore the geometric structure hidden in the data. A simple and effective optimization solution is provided. Extensive experiments on multi-view biological data demonstrate the feasibility and effectiveness of the proposed approach.
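The difference between element sparsity (L1-norm) and row sparsity (L2,1-norm) described above can be made concrete with a small sketch. The L2,1-norm is the sum of the Euclidean norms of the rows; the two toy matrices below are assumptions chosen so that their L1-norms coincide.

```python
import math


def l1_norm(matrix):
    """Sum of absolute values of all entries (encourages element-wise sparsity)."""
    return sum(abs(v) for row in matrix for v in row)


def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (encourages whole rows to vanish)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Same entries, different placement: one active row versus two active rows.
row_sparse = [[3.0, 4.0], [0.0, 0.0]]
spread = [[3.0, 0.0], [0.0, 4.0]]

print(l1_norm(row_sparse), l1_norm(spread))    # identical L1 penalty
print(l21_norm(row_sparse), l21_norm(spread))  # L2,1 is smaller for the row-sparse matrix
```

Both matrices cost the same under the L1-norm, but the L2,1-norm is lower when the non-zero entries are concentrated in one row, which is why it is used as a penalty to discard entire rows.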


Subjects
Algorithms , Principal Component Analysis , Cluster Analysis , Gene Expression Regulation, Neoplastic , Humans , Neoplasms/genetics
7.
BMC Bioinformatics ; 20(Suppl 25): 686, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874608

ABSTRACT

BACKGROUND: Predicting miRNA-disease associations (MDAs) is time-consuming and expensive. Improving the accuracy of prediction results is urgent, so it is crucial to develop a novel computational technique to predict new MDAs. Although some existing methods can effectively predict novel MDAs, they still have shortcomings. In particular, when the disease matrix is processed, its sparsity is an important factor affecting the final results. RESULTS: A robust collaborative matrix factorization (RCMF) method is proposed to predict novel MDAs. The L2,1-norm is introduced into our method, which achieves a higher AUC value than other advanced methods. CONCLUSIONS: 5-fold cross-validation is used to evaluate our method, and simulation experiments are used to predict novel associations on the Gold Standard Dataset. Our prediction accuracy is better than that of other existing advanced methods. Therefore, our approach is effective and feasible for predicting novel MDAs.


Subjects
Algorithms , Liver Neoplasms/genetics , MicroRNAs/metabolism , Area Under Curve , Humans , Liver Neoplasms/pathology , ROC Curve
8.
BMC Bioinformatics ; 20(1): 5, 2019 Jan 05.
Article in English | MEDLINE | ID: mdl-30611214

ABSTRACT

BACKGROUND: Predicting drug-disease interactions (DDIs) is time-consuming and expensive. Improving the accuracy of prediction results is necessary, and it is crucial to develop a novel computing technology to predict new DDIs. The existing methods mostly use the construction of heterogeneous networks to predict new DDIs. However, the number of known interacting drug-disease pairs is small, so there will be many errors in this heterogeneous network that will interfere with the final results. RESULTS: A novel method, known as the dual-network L2,1-collaborative matrix factorization, is proposed to predict novel DDIs. The Gaussian interaction profile kernels and L2,1-norm are introduced in our method to achieve better results than other advanced methods. The network similarities of drugs and diseases with their chemical and semantic similarities are combined in this method. CONCLUSIONS: Cross validation is used to evaluate our method, and simulation experiments are used to predict new interactions using two different datasets. Finally, our prediction accuracy is better than other existing methods. This proves that our method is feasible and effective.


Subjects
Algorithms , Computational Biology/methods , Disease , Drug Interactions , Area Under Curve , Databases as Topic , Humans , Reproducibility of Results , Semantics
9.
BMC Bioinformatics ; 20(1): 353, 2019 Jun 24.
Article in English | MEDLINE | ID: mdl-31234797

ABSTRACT

BACKGROUND: Predicting meaningful miRNA-disease associations (MDAs) is costly. Therefore, an increasing number of researchers are beginning to focus on methods to predict potential MDAs. Thus, prediction methods with improved accuracy are under development. An efficient computational method is proposed to be crucial for predicting novel MDAs. For improved experimental productivity, large biological datasets are used by researchers. Although there are many effective and feasible methods to predict potential MDAs, the possibility remains that these methods are flawed. RESULTS: A simple and effective method, known as Nearest Profile-based Collaborative Matrix Factorization (NPCMF), is proposed to identify novel MDAs. The nearest profile is introduced to our method to achieve the highest AUC value compared with other advanced methods. For some miRNAs and diseases without any association, we use the nearest neighbour information to complete the prediction. CONCLUSIONS: To evaluate the performance of our method, five-fold cross-validation is used to calculate the AUC value. At the same time, three disease cases, gastric neoplasms, rectal neoplasms and colonic neoplasms, are used to predict novel MDAs on a gold-standard dataset. We predict the vast majority of known MDAs and some novel MDAs. Finally, the prediction accuracy of our method is determined to be better than that of other existing methods. Thus, the proposed prediction model can obtain reliable experimental results.
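Several abstracts in this list, including the one above, report an AUC value as the main evaluation metric. As a minimal sketch of what that number measures, the function below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as one half. The function name `auc` and the toy labels and scores are assumptions; in cross-validation the same computation would be applied to the held-out fold of each split.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical held-out pairs: 1 = known association, 0 = unknown pair.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
value = auc(labels, scores)
```

The pairwise form is quadratic in the number of examples and is meant only to make the definition explicit; rank-based implementations give the same value more efficiently.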


Subjects
Colonic Neoplasms/genetics , Computational Biology/methods , MicroRNAs/genetics , Rectal Neoplasms/genetics , Stomach Neoplasms/genetics , Algorithms , Area Under Curve , Colonic Neoplasms/pathology , Databases, Genetic , Genetic Predisposition to Disease , Humans , MicroRNAs/metabolism , ROC Curve , Rectal Neoplasms/pathology , Stomach Neoplasms/pathology
10.
BMC Bioinformatics ; 20(Suppl 8): 287, 2019 Jun 10.
Article in English | MEDLINE | ID: mdl-31182006

ABSTRACT

BACKGROUND: Predicting drug-target interactions is time-consuming and expensive. It is important to present the accuracy of the calculation method. There are many algorithms to predict global interactions, some of which use drug-target networks for prediction (ie, a bipartite graph of bound drug pairs and targets known to interact). Although these algorithms can predict some drug-target interactions to some extent, there is little effect for some new drugs or targets that have no known interaction. RESULTS: Since the datasets are usually located at or near low-dimensional nonlinear manifolds, we propose an improved GRMF (graph regularized matrix factorization) method to learn these flow patterns in combination with the previous matrix-decomposition method. In addition, we use one of the pre-processing steps previously proposed to improve the accuracy of the prediction. CONCLUSIONS: Cross-validation is used to evaluate our method, and simulation experiments are used to predict new interactions. In most cases, our method is superior to other methods. Finally, some examples of new drugs and new targets are predicted by performing simulation experiments. And the improved GRMF method can better predict the remaining drug-target interactions.


Subjects
Algorithms , Drug Interactions , Databases as Topic , Humans , Reproducibility of Results
11.
IEEE J Biomed Health Inform ; 28(2): 1110-1121, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38055359

ABSTRACT

Accumulating evidence indicates that microRNAs (miRNAs) can control and coordinate various biological processes. Consequently, abnormal expression of miRNAs has been linked to various complex diseases. Identification of miRNA-disease associations (MDAs) will contribute to the diagnosis and treatment of human diseases. Nevertheless, traditional experimental verification of MDAs is laborious and limited to small scale. Therefore, it is necessary to develop reliable and effective computational methods to predict novel MDAs. In this work, a multi-kernel graph attention deep autoencoder (MGADAE) method is proposed to predict potential MDAs. In detail, MGADAE first employs the multiple kernel learning (MKL) algorithm to construct an integrated miRNA similarity and an integrated disease similarity, providing more biological information for further feature learning. Second, MGADAE combines the known MDAs, disease similarity, and miRNA similarity into a heterogeneous network, then learns the representations of miRNAs and diseases through graph convolution operations. After that, an attention mechanism is introduced into MGADAE to integrate the representations from multiple graph convolutional network (GCN) layers. Lastly, the integrated representations of miRNAs and diseases are input into a bilinear decoder to obtain the final predicted association scores. Experiments show that the proposed method outperforms existing advanced approaches in MDA prediction. Furthermore, case studies on two human cancers provide further confirmation of the reliability of MGADAE in practice.


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , Reproducibility of Results , Computational Biology/methods , Neoplasms/genetics , Algorithms
12.
IEEE J Biomed Health Inform ; 28(5): 3178-3185, 2024 May.
Article in English | MEDLINE | ID: mdl-38408006

ABSTRACT

CircRNA has been shown to play an important role in the diagnosis and treatment of diseases. Because wet-lab experiments are time-consuming and expensive, computational methods have become a viable alternative in recent years. However, the number of circRNA-disease associations (CDAs) that can be verified is relatively small, and some methods do not take full advantage of dependencies between attributes. To solve these problems, this paper proposes a novel method based on Kernel Fusion and Deep Auto-encoder (KFDAE) to predict potential associations between circRNAs and diseases. First, KFDAE uses a non-linear method to fuse the circRNA similarity kernels and the disease similarity kernels. Then the vectors are concatenated to form the positive and negative sample sets, and these data are sent to a deep auto-encoder to reduce dimensionality and extract features. Finally, a three-layer deep feedforward neural network is used to learn features and obtain the prediction score. The experimental results show that KFDAE achieves the best performance compared with existing methods. In addition, the results of case studies demonstrate the effectiveness and practical significance of KFDAE, which indicates that KFDAE can capture more comprehensive information and generate credible candidates for subsequent wet-lab experiments.


Subjects
Algorithms , Computational Biology , Neural Networks, Computer , RNA, Circular , Humans , RNA, Circular/genetics , Computational Biology/methods , Deep Learning
13.
IEEE J Biomed Health Inform ; 28(5): 3029-3041, 2024 May.
Article in English | MEDLINE | ID: mdl-38427553

ABSTRACT

The roles of brain region activities and genotypic functions in the pathogenesis of Alzheimer's disease (AD) remain unclear. Meanwhile, current imaging genetics methods are difficult to identify potential pathogenetic markers by correlation analysis between brain network and genetic variation. To discover disease-related brain connectome from the specific brain structure and the fine-grained level, based on the Automated Anatomical Labeling (AAL) and human Brainnetome atlases, the functional brain network is first constructed for each subject. Specifically, the upper triangle elements of the functional connectivity matrix are extracted as connectivity features. The clustering coefficient and the average weighted node degree are developed to assess the significance of every brain area. Since the constructed brain network and genetic data are characterized by non-linearity, high-dimensionality, and few subjects, the deep subspace clustering algorithm is proposed to reconstruct the original data. Our multilayer neural network helps capture the non-linear manifolds, and subspace clustering learns pairwise affinities between samples. Moreover, most approaches in neuroimaging genetics are unsupervised learning, neglecting the diagnostic information related to diseases. We presented a label constraint with diagnostic status to instruct the imaging genetics correlation analysis. To this end, a diagnosis-guided deep subspace clustering association (DDSCA) method is developed to discover brain connectome and risk genetic factors by integrating genotypes with functional network phenotypes. Extensive experiments prove that DDSCA achieves superior performance to most association methods and effectively selects disease-relevant genetic markers and brain connectome at the coarse-grained and fine-grained levels.
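The feature-extraction step described above (taking the upper triangle of a functional connectivity matrix as connectivity features and summarising each region by its weighted degree) can be sketched as follows. The use of absolute edge weights, the function names, and the 3-region toy matrix are assumptions made for illustration; the abstract does not give the exact definitions.

```python
def upper_triangle_features(conn):
    """Flatten the strictly upper triangle of a symmetric connectivity matrix."""
    n = len(conn)
    return [conn[i][j] for i in range(n) for j in range(i + 1, n)]


def weighted_degrees(conn):
    """Weighted degree per region: sum of absolute edge weights to all other regions."""
    n = len(conn)
    return [sum(abs(conn[i][j]) for j in range(n) if j != i) for i in range(n)]


# Hypothetical 3-region functional connectivity matrix (e.g. correlations).
conn = [
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.1],
    [-0.2, 0.1, 1.0],
]
features = upper_triangle_features(conn)
degrees = weighted_degrees(conn)
average_degree = sum(degrees) / len(degrees)
```

For n regions the feature vector has n(n-1)/2 entries, because the matrix is symmetric and the diagonal carries no information about connections between regions.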


Subjects
Alzheimer Disease , Brain , Magnetic Resonance Imaging , Humans , Alzheimer Disease/genetics , Alzheimer Disease/diagnostic imaging , Cluster Analysis , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Connectome/methods , Algorithms , Aged , Biomarkers , Female , Male , Atlases as Topic , Neuroimaging/methods
14.
Comput Biol Chem ; 103: 107833, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36812824

ABSTRACT

Many experiments have proved that long non-coding RNAs (lncRNAs) in humans have been implicated in disease development. The prediction of lncRNA-disease association is essential in promoting disease treatment and drug development. It is time-consuming and laborious to explore the relationship between lncRNA and diseases in the laboratory. The computation-based approach has clear advantages and has become a promising research direction. This paper proposes a new lncRNA disease association prediction algorithm BRWMC. Firstly, BRWMC constructed several lncRNA (disease) similarity networks based on different measurement angles and fused them into an integrated similarity network by similarity network fusion (SNF). In addition, the random walk method is used to preprocess the known lncRNA-disease association matrix and calculate the estimated scores of potential lncRNA-disease associations. Finally, the matrix completion method accurately predicts the potential lncRNA-disease associations. Under the framework of leave-one-out cross-validation and 5-fold cross-validation, the AUC values obtained by BRWMC are 0.9610 and 0.9739, respectively. In addition, case studies of three common diseases show that BRWMC is a reliable method for prediction.


Subjects
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Computational Biology/methods , Algorithms
15.
J Comput Biol ; 30(8): 937-947, 2023 08.
Article in English | MEDLINE | ID: mdl-37486669

ABSTRACT

Determining the association between drug and disease is important in drug development. However, existing approaches for drug-disease associations (DDAs) prediction are too homogeneous in terms of feature extraction. Here, a novel graph representation approach based on light gradient boosting machine (GRLGB) is proposed for prediction of DDAs. After the introduction of the protein into a heterogeneous network, nodes features were extracted from two perspectives: network topology and biological knowledge. Finally, the GRLGB classifier was applied to predict potential DDAs. GRLGB achieved satisfactory results on Bdataset and Fdataset through 10-fold cross-validation. To further prove the reliability of the GRLGB, case studies involving anxiety disorders and clozapine were conducted. The results suggest that GRLGB can identify novel DDAs.


Subjects
Computational Biology , Proteins , Reproducibility of Results , Computational Biology/methods , Algorithms
16.
IEEE J Biomed Health Inform ; 27(7): 3686-3694, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37163398

ABSTRACT

Identifying drug-disease associations (DDAs) is critical to the development of drugs. Traditional methods to determine DDAs are expensive and inefficient. Therefore, it is imperative to develop more accurate and effective methods for DDAs prediction. Most current DDAs prediction methods utilize original DDAs matrix directly. However, the original DDAs matrix is sparse, which greatly affects the prediction consequences. Hence, a prediction method based on multi-similarities graph convolutional autoencoder (MSGCA) is proposed for DDAs prediction. First, MSGCA integrates multiple drug similarities and disease similarities using centered kernel alignment-based multiple kernel learning (CKA-MKL) algorithm to form new drug similarity and disease similarity, respectively. Second, the new drug and disease similarities are improved by linear neighborhood, and the DDAs matrix is reconstructed by weighted K nearest neighbor profiles. Next, the reconstructed DDAs and the improved drug and disease similarities are integrated into a heterogeneous network. Finally, the graph convolutional autoencoder with attention mechanism is utilized to predict DDAs. Compared with extant methods, MSGCA shows superior results on three datasets. Furthermore, case studies further demonstrate the reliability of MSGCA.


Subjects
Algorithms , Humans , Reproducibility of Results
17.
Article in English | MEDLINE | ID: mdl-37022835

ABSTRACT

Studies have revealed that microbes have an important effect on numerous physiological processes, and further research on the links between diseases and microbes is significant. Given that laboratory methods are expensive and not optimized, computational models are increasingly used for discovering disease-related microbes. Here, a new neighbor approach based on two-tier Bi-Random Walk is proposed for potential disease-related microbes, known as NTBiRW. In this method, the first step is to construct multiple microbe similarities and disease similarities. Then, three kinds of microbe/disease similarity are integrated through two-tier Bi-Random Walk to obtain the final integrated microbe/disease similarity network with different weights. Finally, Weighted K Nearest Known Neighbors (WKNKN) is used for prediction based on the final similarity network. In addition, leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV) are applied for evaluating the performance of NTBiRW. Multiple evaluating indicators are taken to show the performance from multiple perspectives. And most of the evaluation index values of NTBiRW are better than those of the compared methods. Moreover, in case studies on atopic dermatitis and psoriasis, most of the first 10 candidates in the final result can be proven. This also demonstrates the capability of NTBiRW for discovering new associations. Therefore, this method can contribute to the discovery of disease-related microbes and thus offer new thoughts for further understanding the pathogenesis of diseases.

18.
Article in English | MEDLINE | ID: mdl-35085090

ABSTRACT

An increase in microbial activity has been shown to be intimately connected with the pathogenesis of diseases. Considering the expense of traditional verification methods, researchers are working to develop high-efficiency methods for detecting potential disease-related microbes. In this article, a new prediction method, MSF-LRR, is established, which uses Low-Rank Representation (LRR) to perform multi-similarity information fusion to predict disease-related microbes. Because most existing methods use only one class of similarity, three classes of microbe and disease similarity are added. Then, LRR is used to obtain low-rank structural similarity information. Additionally, the method adaptively extracts the local low-rank structure of the data from a global perspective to make the information used for prediction more effective. Finally, a neighbor-based prediction method that uses the concept of collaborative filtering is applied to predict unknown microbe-disease pairs. As a result, the AUC value of MSF-LRR is superior to that of other existing algorithms under 5-fold cross-validation. Furthermore, in case studies, excluding originally known associations, 16 and 19 of the top 20 microbes associated with Bacterial Vaginosis and Irritable Bowel Syndrome, respectively, have been confirmed by the recent literature. In summary, MSF-LRR is a good predictor of potential microbe-disease associations and can contribute to drug discovery and biological research.


Subjects
Algorithms , Bacteria , Disease , Host Microbial Interactions , Bacteria/pathogenicity
19.
IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 1774-1782, 2023.
Article in English | MEDLINE | ID: mdl-36251902

ABSTRACT

With the development of bioinformatics, the important role played by lncRNAs in various intractable diseases has aroused the interest of many experts. In recent studies, researchers have found that several human diseases are related to lncRNAs. However, exploring unknown lncRNA-disease associations (LDAs) is very difficult and expensive, so only a few associations have been confirmed. It is vital to find a more accurate and effective method to identify potential LDAs. In this study, a method of collaborative matrix factorization based on correntropy (LDCMFC) is proposed for the identification of potential LDAs. To improve the robustness of the algorithm, the traditional minimization of the Euclidean distance is replaced with the maximization of correntropy. In addition, the weighted K nearest known neighbor (WKNKN) method is used to rebuild the adjacency matrix. Finally, the performance of LDCMFC is tested by 5-fold cross-validation. Compared with other traditional methods, LDCMFC obtains a higher AUC of 0.8628. In case studies of three important cancers, most of the potentially relevant lncRNAs derived from the experiments have been validated in the databases. The final result shows that LDCMFC is a feasible method to predict LDAs.


Subjects
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Algorithms , Computational Biology/methods , Databases, Factual , Cluster Analysis
20.
Interdiscip Sci ; 15(1): 88-99, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36335274

ABSTRACT

With the high-quality development of bioinformatics technology, miRNA-disease associations (MDAs) are gradually being uncovered. At present, convenient and efficient prediction methods, which solve the problem of resource-consuming in traditional wet experiments, need to be further put forward. In this study, a space projection model based on block matrix is presented for predicting MDAs (BMPMDA). Specifically, two block matrices are first composed of the known association matrix and similarity to increase comprehensiveness. For the integrity of information in the heterogeneous network, matrix completion (MC) is utilized to mine potential MDAs. Considering the neighborhood information of data points, linear neighborhood similarity (LNS) is regarded as a measure of similarity. Next, LNS is projected onto the corresponding completed association matrix to derive the projection score. Finally, the AUC and AUPR values for BMPMDA reach 0.9691 and 0.6231, respectively. Additionally, the majority of novel MDAs in three disease cases are identified in existing databases and literature. It suggests that BMPMDA can serve as a reliable prediction model for biological research.


Subjects
MicroRNAs , Humans , Algorithms , Computational Biology/methods , Forecasting , Databases, Factual , Genetic Predisposition to Disease