1.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36567255

ABSTRACT

Underlying medical conditions, such as cancer, kidney disease and heart failure, are associated with a higher risk for severe COVID-19. Accurate classification of COVID-19 patients with underlying medical conditions is critical for personalized treatment decision and prognosis estimation. In this study, we propose an interpretable artificial intelligence model termed VDJMiner to mine the underlying medical conditions and predict the prognosis of COVID-19 patients according to their immune repertoires. In a cohort of more than 1400 COVID-19 patients, VDJMiner accurately identifies multiple underlying medical conditions, including cancers, chronic kidney disease, autoimmune disease, diabetes, congestive heart failure, coronary artery disease, asthma and chronic obstructive pulmonary disease, with an average area under the receiver operating characteristic curve (AUC) of 0.961. Meanwhile, in this same cohort, VDJMiner achieves an AUC of 0.922 in predicting severe COVID-19. Moreover, VDJMiner achieves an accuracy of 0.857 in predicting the response of COVID-19 patients to tocilizumab treatment on the leave-one-out test. Additionally, VDJMiner interpretively mines and scores V(D)J gene segments of the T-cell receptors that are associated with the disease. The identified associations between single-cell V(D)J gene segments and COVID-19 are highly consistent with previous studies. The source code of VDJMiner is publicly accessible at https://github.com/TencentAILabHealthcare/VDJMiner. The web server of VDJMiner is available at https://gene.ai.tencent.com/VDJMiner/.


Subject(s)
Asthma; COVID-19; Humans; Artificial Intelligence; ROC Curve; Software
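Note on the metric: the abstract above reports performance as the area under the ROC curve (AUC). As a minimal sketch of how such a value can be computed, the function below uses the rank-based (Mann-Whitney) identity: the AUC equals the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy inputs are illustrative assumptions and are not taken from VDJMiner.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: iterable of 0/1 class labels (1 = positive).
    scores: iterable of predicted scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples; it is chosen here for clarity rather than speed.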
2.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35901464

ABSTRACT

MOTIVATION: The associations between biomarkers and human diseases play a key role in understanding complex pathology and developing targeted therapies. Wet lab experiments for biomarker discovery are costly, laborious and time-consuming. Computational prediction methods can be used to greatly expedite the identification of candidate biomarkers. RESULTS: Here, we present a novel computational model named GTGenie for predicting the biomarker-disease associations based on graph and text features. In GTGenie, a graph attention network is utilized to characterize diverse similarities of biomarkers and diseases from heterogeneous information resources. Meanwhile, a pretrained BERT-based model is applied to learn the text-based representation of biomarker-disease relation from biomedical literature. The captured graph and text features are then integrated in a bimodal fusion network to model the hybrid entity representation. Finally, inductive matrix completion is adopted to infer the missing entries for reconstructing relation matrix, with which the unknown biomarker-disease associations are predicted. Experimental results on HMDD, HMDAD and LncRNADisease data sets showed that GTGenie can obtain competitive prediction performance with other state-of-the-art methods. AVAILABILITY: The source code of GTGenie and the test data are available at: https://github.com/Wolverinerine/GTGenie.


Subject(s)
Computational Biology; Software; Computational Biology/methods; Humans
3.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37527015

ABSTRACT

MOTIVATION: The interactions between T-cell receptors (TCR) and peptide-major histocompatibility complex (pMHC) are essential for the adaptive immune system. However, identifying these interactions can be challenging due to the limited availability of experimental data, sequence data heterogeneity, and high experimental validation costs. RESULTS: To address this issue, we develop a novel computational framework, named MIX-TPI, to predict TCR-pMHC interactions using amino acid sequences and physicochemical properties. Based on convolutional neural networks, MIX-TPI incorporates sequence-based and physicochemical-based extractors to refine the representations of TCR-pMHC interactions. Each modality is projected into modality-invariant and modality-specific representations to capture the uniformity and diversities between different features. A self-attention fusion layer is then adopted to form the classification module. Experimental results demonstrate the effectiveness of MIX-TPI in comparison with other state-of-the-art methods. MIX-TPI also shows good generalization capability on mutual exclusive evaluation datasets and a paired TCR dataset. AVAILABILITY AND IMPLEMENTATION: The source code of MIX-TPI and the test data are available at: https://github.com/Wolverinerine/MIX-TPI.


Subject(s)
Major Histocompatibility Complex; Peptides; Peptides/chemistry; Receptors, Antigen, T-Cell/genetics; Amino Acid Sequence; Software; Protein Binding
4.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32633319

ABSTRACT

MOTIVATION: Identifying microRNAs that are associated with different diseases as biomarkers is a problem of great medical significance. Existing computational methods for uncovering such microRNA-disease associations (MDAs) are mostly developed under the assumption that similar microRNAs tend to associate with similar diseases. Since such an assumption is not always valid, these methods may not always be applicable to all kinds of MDAs. Considering that the relationship between long noncoding RNA (lncRNA) and different diseases and the co-regulation relationships between the biological functions of lncRNA and microRNA have been established, we propose here a multiview multitask method to make use of the known lncRNA-microRNA interactions to predict MDAs on a large scale. The investigation is performed in the absence of complete information on microRNAs and of any similarity measurement for them and, to the best of our knowledge, the work represents the first attempt to discover MDAs based on lncRNA-microRNA interactions. RESULTS: In this paper, we propose a deep learning model called MVMTMDA that can create a multiview representation of microRNAs. The model is trained with an end-to-end multitask learning approach so that missing data in the side information can be determined automatically. Experimental results show that the proposed model yields an average area under the ROC curve of 0.8410 ± 0.018, 0.8512 ± 0.012 and 0.8521 ± 0.008 when k is set to 2, 5 and 10, respectively. In addition, we also propose here a statistical approach to predicting lncRNA-disease associations based on these associations and the MDAs discovered using MVMTMDA. AVAILABILITY: Python code and the datasets used in our studies are made available at https://github.com/yahuang1991polyu/MVMTMDA/.


Subject(s)
Disease/genetics; Machine Learning; MicroRNAs; Models, Genetic; RNA, Long Noncoding; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Predictive Value of Tests; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism
5.
BMC Genomics ; 22(Suppl 1): 916, 2022 Mar 16.
Article in English | MEDLINE | ID: mdl-35296232

ABSTRACT

BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. The dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify the disorder-specific microbes from the biological "haystack" merely by routine wet-lab experiments. With the developments in next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS: Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with areas under the ROC curve (AUC) of 0.9456 and 0.8866 in leave-one-out cross validation and five-fold cross validation, respectively. In case studies of colorectal carcinoma, 80% of the top-20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION: Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we here propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the microbes most likely to be associated with various human diseases. Based on the sequence information of genes, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure the microbe-microbe similarity from different perspectives. The disease-disease similarity is calculated by capturing the hierarchy information from the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.


Subject(s)
Algorithms; Bacteria/classification; Computational Biology; RNA, Ribosomal, 16S; Bayes Theorem; Computational Biology/methods; Genes, rRNA; Humans; RNA, Ribosomal, 16S/genetics
6.
PLoS Comput Biol ; 13(3): e1005455, 2017 03.
Article in English | MEDLINE | ID: mdl-28339468

ABSTRACT

In recent years, an increasing number of studies have shown that microRNAs (miRNAs) play critical roles in many fundamental and important biological processes. The molecular mechanisms underlying complex human diseases, in which miRNAs are one of the pathogenetic factors, are still not completely understood from the miRNA perspective. Predicting potential miRNA-disease associations makes important contributions to understanding the pathogenesis of diseases, developing new drugs, and formulating individualized diagnosis and treatment for diverse complex human diseases. Instead of depending only on expensive and time-consuming biological experiments, computational prediction models can effectively predict potential miRNA-disease associations, prioritize candidate miRNAs for the investigated diseases, and select those miRNAs with higher association probabilities for further experimental validation. In this study, the Path-Based MiRNA-Disease Association (PBMDA) prediction model was proposed by integrating known human miRNA-disease associations, miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for miRNAs and diseases. This model constructed a heterogeneous graph consisting of three interlinked sub-graphs and further adopted a depth-first search algorithm to infer potential miRNA-disease associations. As a result, PBMDA achieved reliable performance in the frameworks of both local and global LOOCV (AUCs of 0.8341 and 0.9169, respectively) and 5-fold cross validation (average AUC of 0.9172). In the case studies of three important human diseases, 88% (Esophageal Neoplasms), 88% (Kidney Neoplasms) and 90% (Colon Neoplasms) of the top-50 predicted miRNAs have been manually confirmed by previous experimental reports in the literature. The comparison between PBMDA and previous models in the case studies also demonstrates that PBMDA could serve as a powerful computational tool to accelerate the identification of disease-miRNA associations.


Subject(s)
Biomarkers, Tumor/genetics; Genetic Association Studies; MicroRNAs/genetics; Models, Statistical; Neoplasms/epidemiology; Neoplasms/genetics; Computer Simulation; Genetic Predisposition to Disease/epidemiology; Genetic Predisposition to Disease/genetics; Humans; Models, Genetic; Prevalence; Prognosis; Risk Assessment/methods; Risk Factors; Signal Transduction/genetics
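Note on the similarity measure: the abstract above lists Gaussian interaction profile kernel similarity among its inputs. A minimal sketch follows, assuming the commonly used definition K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is the binary association row of entity i and gamma is a bandwidth divided by the mean squared norm of all profiles. That definition, the parameter name `gamma_prime` and the toy matrix are assumptions for illustration; the abstract itself does not spell out the formula, and PBMDA's exact settings may differ.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    profiles: list of equal-length 0/1 lists (e.g. one row per miRNA,
              one column per disease).
    gamma_prime: bandwidth before normalization (assumed default of 1.0).
    """
    n = len(profiles)
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq  # normalize bandwidth by mean squared norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist2)
    return kernel


if __name__ == "__main__":
    # Hypothetical association matrix: rows 0 and 1 share the same profile.
    toy = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
    for row in gip_kernel(toy):
        print([round(v, 4) for v in row])
```

Identical profiles get similarity 1, and similarity decays as the profiles diverge, which is what lets such a kernel stand in when functional or semantic similarity is unavailable.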
7.
BMC Bioinformatics ; 18(1): 179, 2017 Mar 20.
Article in English | MEDLINE | ID: mdl-28320326

ABSTRACT

BACKGROUND: The rapid progress of high-throughput DNA sequencing techniques has dramatically reduced the costs of whole genome sequencing, which leads to revolutionary advances in gene industry. The explosively increasing volume of raw data outpaces the decreasing disk cost and the storage of huge sequencing data has become a bottleneck of downstream analyses. Data compression is considered as a solution to reduce the dependency on storage. Efficient sequencing data compression methods are highly demanded. RESULTS: In this article, we present a lossless reference-based compression method namely LW-FQZip 2 targeted at FASTQ files. LW-FQZip 2 is improved from LW-FQZip 1 by introducing more efficient coding scheme and parallelism. Particularly, LW-FQZip 2 is equipped with a light-weight mapping model, bitwise prediction by partial matching model, arithmetic coding, and multi-threading parallelism. LW-FQZip 2 is evaluated on both short-read and long-read data generated from various sequencing platforms. The experimental results show that LW-FQZip 2 is able to obtain promising compression ratios at reasonable time and memory space costs. CONCLUSIONS: The competence enables LW-FQZip 2 to serve as a candidate tool for archival or space-sensitive applications of high-throughput DNA sequencing data. LW-FQZip 2 is freely available at http://csse.szu.edu.cn/staff/zhuzx/LWFQZip2 and https://github.com/Zhuzxlab/LW-FQZip2 .


Subject(s)
Data Compression/methods; High-Throughput Nucleotide Sequencing/methods; Sequence Alignment/methods; Sequence Analysis, DNA/methods
8.
J Transl Med ; 15(1): 209, 2017 10 16.
Article in English | MEDLINE | ID: mdl-29037244

ABSTRACT

BACKGROUND: Accumulating clinical research has shown that specific microbes with abnormal levels are closely associated with the development of various human diseases. Knowledge of microbe-disease associations can provide valuable insights for understanding complex disease mechanisms as well as for the prevention, diagnosis and treatment of various diseases. However, little effort has been made to predict microbial candidates for complex human diseases on a large scale. METHODS: In this work, we developed a new computational model for predicting microbe-disease associations by combining two single recommendation methods. Based on the assumption that functionally similar microbes tend to get involved in the mechanisms of similar diseases, we adopted neighbor-based collaborative filtering and a graph-based scoring method to compute the association possibility of microbe-disease pairs. The promising prediction performance could be attributed to the use of a hybrid approach based on two single recommendation methods as well as the introduction of Gaussian kernel-based similarity and symptom-based disease similarity. RESULTS: To evaluate the performance of the proposed model, we implemented leave-one-out and fivefold cross validations on the HMDAD database, which was recently built as the first database collecting experimentally confirmed microbe-disease associations. As a result, NGRHMDA achieved reliable results with AUCs of 0.9023 ± 0.0031 and 0.9111 in the validation frameworks of fivefold CV and LOOCV, respectively. In addition, 78.2% of microbe samples and 66.7% of disease samples are found to be consistent with the basic assumption of our work that microbes tend to get involved in similar disease clusters, and vice versa. CONCLUSIONS: Compared with other methods, the prediction results yielded by NGRHMDA demonstrate its effective prediction performance for microbe-disease associations. It is anticipated that NGRHMDA can be used as a useful tool to search for the most promising microbial candidates for various diseases, and therefore boost medical knowledge and drug development. The codes and dataset of our work can be downloaded from https://github.com/yahuang1991/NGRHMDA.


Subject(s)
Algorithms; Computer Simulation; Host-Pathogen Interactions; Humans; ROC Curve; Reproducibility of Results
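Note on the method: the abstract above names neighbor-based collaborative filtering as one of its two recommendation components. The sketch below shows the generic idea only: a microbe's score for a disease is the similarity-weighted average of the known associations of its k most similar microbes. The function name, the choice of k and the toy matrices are hypothetical, and this is not the exact scoring formula of NGRHMDA, which the abstract does not give.

```python
def neighbor_cf_scores(assoc, sim, k=2):
    """Generic neighbor-based collaborative filtering scores.

    assoc: list of rows, assoc[m][d] = 1 if microbe m is known to be
           associated with disease d, else 0.
    sim:   square matrix, sim[m][j] = similarity between microbes m and j.
    k:     number of nearest neighbors used per microbe (assumed value).
    """
    n_microbes = len(assoc)
    n_diseases = len(assoc[0])
    scores = [[0.0] * n_diseases for _ in range(n_microbes)]
    for m in range(n_microbes):
        # Rank the other microbes by similarity to m and keep the top k.
        neighbors = sorted(
            (j for j in range(n_microbes) if j != m),
            key=lambda j: sim[m][j],
            reverse=True,
        )[:k]
        total = sum(sim[m][j] for j in neighbors)
        if total == 0:
            continue  # no informative neighbors: leave scores at 0
        for d in range(n_diseases):
            weighted = sum(sim[m][j] * assoc[j][d] for j in neighbors)
            scores[m][d] = weighted / total
    return scores


if __name__ == "__main__":
    # Hypothetical data: microbes 0 and 1 are similar and share disease 0.
    assoc = [[1, 0], [1, 0], [0, 1]]
    sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
    print(neighbor_cf_scores(assoc, sim, k=1))
```

In a hybrid model of the kind described, scores like these would be combined with a second, graph-based score before ranking candidate microbes for each disease.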
9.
IEEE Trans Biomed Eng ; 70(4): 1137-1149, 2023 04.
Article in English | MEDLINE | ID: mdl-36178988

ABSTRACT

OBJECTIVE: Deep learning (DL) techniques have been introduced to assist doctors in the interpretation of medical images by detecting image-derived phenotype abnormality. Yet the privacy-preserving policy of medical images disables the effective training of DL model using sufficiently large datasets. As a decentralized computing paradigm to address this issue, federated learning (FL) allows the training process to occur in individual institutions with local datasets, and then aggregates the resultant weights without risk of privacy leakage. METHODS: We propose an effective federated multi-task learning (MTL) framework to jointly identify multiple related mental disorders based on functional magnetic resonance imaging data. A federated contrastive learning-based feature extractor is developed to extract high-level features across client models. To ease the optimization conflicts of updating shared parameters in MTL, we present a federated multi-gate mixture of expert classifier for the joint classification. The proposed framework also provides practical modules, including personalized model learning, privacy protection, and federated biomarker interpretation. RESULTS: On real-world datasets, the proposed framework achieves robust diagnosis accuracies of 69.48 ± 1.6%, 71.44 ± 3.2%, and 83.29 ± 3.2% in autism spectrum disorder, attention deficit/hyperactivity disorder, and schizophrenia, respectively. CONCLUSION: The proposed framework can effectively ease the domain shift between clients via federated MTL. SIGNIFICANCE: The current work provides insights into exploiting the advantageous knowledge shared in related mental disorders for improving the generalization capability of computer-aided detection approaches.


Subject(s)
Autism Spectrum Disorder; Mental Disorders; Humans; Autism Spectrum Disorder/diagnostic imaging; Mental Disorders/diagnostic imaging; Magnetic Resonance Imaging
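Note on the method: the abstract above describes federated learning as training locally at each institution and then aggregating the resulting weights. A minimal sketch of one common aggregation rule, sample-size-weighted averaging (often called FedAvg), is given below. The abstract does not state which aggregation rule the proposed framework uses, so the rule, the function name and the toy values are assumptions for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors.

    client_weights: list of flat parameter lists, one per client.
    client_sizes:   number of local training samples per client.
    Only the parameters are shared; raw data never leaves a client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged


if __name__ == "__main__":
    # Hypothetical round: the client with 3x more data pulls the average toward it.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Weighting by local dataset size keeps a small site from having the same influence on the shared model as a large one.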
10.
Article in English | MEDLINE | ID: mdl-37027556

ABSTRACT

Neuroimaging techniques have been widely adopted to detect the neurological brain structures and functions of the nervous system. As an effective noninvasive neuroimaging technique, functional magnetic resonance imaging (fMRI) has been extensively used in computer-aided diagnosis (CAD) of mental disorders, e.g., autism spectrum disorder (ASD) and attention deficit/hyperactivity disorder (ADHD). In this study, we propose a spatial-temporal co-attention learning (STCAL) model for diagnosing ASD and ADHD from fMRI data. In particular, a guided co-attention (GCA) module is developed to model the intermodal interactions of spatial and temporal signal patterns. A novel sliding cluster attention module is designed to address global feature dependency of self-attention mechanism in fMRI time series. Comprehensive experimental results demonstrate that our STCAL model can achieve competitive accuracies of 73.0 ± 4.5%, 72.0 ± 3.8%, and 72.5 ± 4.2% on the ABIDE I, ABIDE II, and ADHD-200 datasets, respectively. Moreover, the potential for feature pruning based on the co-attention scores is validated by the simulation experiment. The clinical interpretation analysis of STCAL can allow medical professionals to concentrate on the discriminative regions of interest and key time frames from fMRI data.

11.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13778-13795, 2023 11.
Article in English | MEDLINE | ID: mdl-37486851

ABSTRACT

The high prevalence of mental disorders gradually poses a huge pressure on the public healthcare services. Deep learning-based computer-aided diagnosis (CAD) has emerged to relieve the tension in healthcare institutions by detecting abnormal neuroimaging-derived phenotypes. However, training deep learning models relies on sufficient annotated datasets, which can be costly and laborious. Semi-supervised learning (SSL) and transfer learning (TL) can mitigate this challenge by leveraging unlabeled data within the same institution and advantageous information from source domain, respectively. This work is the first attempt to propose an effective semi-supervised transfer learning (SSTL) framework dubbed S3TL for CAD of mental disorders on fMRI data. Within S3TL, a secure cross-domain feature alignment method is developed to generate target-related source model in SSL. Subsequently, we propose an enhanced dual-stage pseudo-labeling approach to assign pseudo-labels for unlabeled samples in target domain. Finally, an advantageous knowledge transfer method is conducted to improve the generalization capability of the target model. Comprehensive experimental results demonstrate that S3TL achieves competitive accuracies of 69.14%, 69.65%, and 72.62% on ABIDE-I, ABIDE-II, and ADHD-200 datasets, respectively. Furthermore, the simulation experiments also demonstrate the application potential of S3TL through model interpretation analysis and federated learning extension.


Subject(s)
Magnetic Resonance Imaging; Mental Disorders; Humans; Algorithms; Mental Disorders/diagnostic imaging; Neuroimaging; Supervised Machine Learning
12.
Article in English | MEDLINE | ID: mdl-36459608

ABSTRACT

Facing the increasing worldwide prevalence of mental disorders, the symptom-based diagnostic criteria struggle to address the urgent public health concern due to the global shortfall in well-qualified professionals. Thanks to the recent advances in neuroimaging techniques, functional magnetic resonance imaging (fMRI) has surfaced as a new solution to characterize neuropathological biomarkers for detecting functional connectivity (FC) anomalies in mental disorders. However, the existing computer-aided diagnosis models for fMRI analysis suffer from unstable performance on large datasets. To address this issue, we propose an efficient multitask learning (MTL) framework for joint diagnosis of multiple mental disorders using resting-state fMRI data. A novel multiobjective evolutionary clustering algorithm is presented to group regions of interests (ROIs) into different clusters for FC pattern analysis. On the optimal clustering solution, the multicluster multigate mixture-of-expert model is used for the final classification by capturing the highly consistent feature patterns among related diagnostic tasks. Extensive simulation experiments demonstrate that the performance of the proposed framework is superior to that of the other state-of-the-art methods. Moreover, the potential for practical application of the framework is also validated in terms of limited computational resources, real-time analysis, and insufficient training data. The proposed model can identify the remarkable interpretative biomarkers associated with specific mental disorders for clinical interpretation analysis.

13.
Article in English | MEDLINE | ID: mdl-36374900

ABSTRACT

The globally rising prevalence of mental disorders leads to shortfalls in timely diagnosis and therapy to reduce patients' suffering. Facing such an urgent public health problem, professional efforts based on symptom criteria are seriously overstretched. Recently, the successful applications of computer-aided diagnosis approaches have provided timely opportunities to relieve the tension in healthcare services. Particularly, multimodal representation learning gains increasing attention thanks to the high temporal and spatial resolution information extracted from neuroimaging fusion. In this work, we propose an efficient multimodality fusion framework to identify multiple mental disorders based on the combination of functional and structural magnetic resonance imaging. A multioutput conditional generative adversarial network (GAN) is developed to address the scarcity of multimodal data for augmentation. Based on the augmented training data, the multiheaded gating fusion model is proposed for classification by extracting the complementary features across different modalities. The experiments demonstrate that the proposed model can achieve robust accuracies of 75.1 ± 1.5%, 72.9 ± 1.1%, and 87.2 ± 1.5% for autism spectrum disorder (ASD), attention deficit/hyperactivity disorder, and schizophrenia, respectively. In addition, the interpretability of our model is expected to enable the identification of remarkable neuropathology diagnostic biomarkers, leading to well-informed therapeutic decisions.

14.
IEEE Trans Neural Netw Learn Syst ; 32(7): 2847-2861, 2021 07.
Article in English | MEDLINE | ID: mdl-32692687

ABSTRACT

With the increasing prevalence of autism spectrum disorder (ASD), it is important to identify ASD patients for effective treatment and intervention, especially in early childhood. Neuroimaging techniques have been used to characterize the complex biomarkers based on the functional connectivity anomalies in the ASD. However, the diagnosis of ASD still adopts the symptom-based criteria by clinical observation. The existing computational models tend to achieve unreliable diagnostic classification on the large-scale aggregated data sets. In this work, we propose a novel graph-based classification model using the deep belief network (DBN) and the Autism Brain Imaging Data Exchange (ABIDE) database, which is a worldwide multisite functional and structural brain imaging data aggregation. The remarkable connectivity features are selected through a graph extension of K -nearest neighbors and then refined by a restricted path-based depth-first search algorithm. Thanks to the feature reduction, lower computational complexity could contribute to the shortening of the training time. The automatic hyperparameter-tuning technique is introduced to optimize the hyperparameters of the DBN by exploring the potential parameter space. The simulation experiments demonstrate the superior performance of our model, which is 6.4% higher than the best result reported on the ABIDE database. We also propose to use the data augmentation and the oversampling technique to identify further the possible subtypes within the ASD. The interpretability of our model enables the identification of the most remarkable autistic neural correlation patterns from the data-driven outcomes.


Subject(s)
Autism Spectrum Disorder/diagnostic imaging; Brain-Computer Interfaces; Magnetic Resonance Imaging/methods; Algorithms; Autism Spectrum Disorder/classification; Brain Mapping; Computer Simulation; Databases, Factual; Deep Learning; Humans; Neural Networks, Computer; Neuroimaging
15.
IEEE Trans Neural Netw Learn Syst ; 32(9): 3971-3984, 2021 09.
Article in English | MEDLINE | ID: mdl-32841125

ABSTRACT

As a group of complex neurodevelopmental disorders, autism spectrum disorder (ASD) has been reported to have a high overall prevalence, showing an unprecedented spurt since 2000. Due to the unclear pathomechanism of ASD, it is challenging to diagnose individuals with ASD merely based on clinical observations. Without additional support of biochemical markers, the difficulty of diagnosis could impact therapeutic decisions and, therefore, lead to delayed treatments. Recently, accumulating evidence has shown that both genetic abnormalities and chemical toxicants play important roles in the onset of ASD. In this work, a new multilabel classification (MLC) model is proposed to identify the autistic risk genes and toxic chemicals on a large-scale data set. We first construct the feature matrices and partially labeled networks for autistic risk genes and toxic chemicals from multiple heterogeneous biological databases. Based on both global and local measure metrics, the simulation experiments demonstrate that the proposed model achieves superior classification performance in comparison with the other state-of-the-art MLC methods. Through manual validation with existing studies, 60% and 50% of the top-20 predicted risk genes are confirmed to have associations with ASD and autistic disorder, respectively. To the best of our knowledge, this is the first computational tool to identify ASD-related risk genes and toxic chemicals, which could lead to better therapeutic decisions for ASD.


Subject(s)
Autism Spectrum Disorder/chemically induced; Autism Spectrum Disorder/genetics; Autistic Disorder/chemically induced; Autistic Disorder/genetics; Hazardous Substances/classification; Hazardous Substances/toxicity; Machine Learning; Algorithms; Biomarkers; Computer Simulation; Databases, Genetic; Gene-Environment Interaction; Humans; Neural Networks, Computer; Risk Assessment
16.
Front Genet ; 10: 758, 2019.
Article in English | MEDLINE | ID: mdl-31555320

ABSTRACT

The interaction of miRNA and lncRNA is known to be important for gene regulations. However, the number of known lncRNA-miRNA interactions is still very limited and there are limited computational tools available for predicting new ones. Considering that lncRNAs and miRNAs share internal patterns in the partnership between each other, the underlying lncRNA-miRNA interactions could be predicted by utilizing the known ones, which could be considered as a semi-supervised learning problem. It is shown that the attributes of lncRNA and miRNA have a close relationship with the interaction between each other. Effective use of side information could be helpful for improving the performance especially when the training samples are limited. In view of this, we proposed an end-to-end prediction model called GCLMI (Graph Convolution for novel lncRNA-miRNA Interactions) by combining the techniques of graph convolution and auto-encoder. Without any preprocessing process on the feature information, our method can incorporate raw data of node attributes with the topology of the interaction network. Based on a real dataset collected from a public database, the results of experiments conducted on k-fold cross validations illustrate the robustness and effectiveness of the prediction performance of the proposed prediction model. We prove the graph convolution layer as designed in the proposed model able to effectively integrate the input data by filtering the graph with node features. The proposed model is anticipated to yield highly potential lncRNA-miRNA interactions in the scenario that different types of numerical features describing lncRNA or miRNA are provided by users, serving as a useful computational tool.

17.
Brief Funct Genomics ; 18(1): 58-82, 2019 02 14.
Article in English | MEDLINE | ID: mdl-30247501

ABSTRACT

From transcriptional noise to dark matter of biology, the rapidly changing view of long non-coding RNA (lncRNA) leads to deep understanding of human complex diseases induced by abnormal expression of lncRNAs. There is urgent need to discern potential functional roles of lncRNAs for further study of pathology, diagnosis, therapy, prognosis, prevention of human complex disease and disease biomarker detection at lncRNA level. Computational models are anticipated to be an effective way to combine current related databases for predicting most potential lncRNA functions and calculating lncRNA functional similarity on the large scale. In this review, we firstly illustrated the biological function of lncRNAs from five biological processes and briefly depicted the relationship between mutations or dysfunctions of lncRNAs and human complex diseases involving cancers, nervous system disorders and others. Then, 17 publicly available lncRNA function-related databases containing four types of functional information content were introduced. Based on these databases, dozens of developed computational models are emerging to help characterize the functional roles of lncRNAs. We therefore systematically described and classified both 16 lncRNA function prediction models and 9 lncRNA functional similarity calculation models into 8 types for highlighting their core algorithm and process. Finally, we concluded with discussions about the advantages and limitations of these computational models and future directions of lncRNA function prediction and functional similarity calculation. We believe that constructing systematic functional annotation systems is essential to strengthen the prediction accuracy of computational models, which will accelerate the identification process of novel lncRNA functions in the future.


Subject(s)
Algorithms , Computational Biology/methods , Computer Simulation , Disease/genetics , Gene Regulatory Networks , RNA, Long Noncoding/genetics , Humans
18.
BMC Med Genomics ; 11(Suppl 6): 113, 2018 Dec 31.
Article in English | MEDLINE | ID: mdl-30598112

ABSTRACT

BACKGROUND: Current knowledge and data on miRNA-lncRNA interactions are still limited, and little effort has been made to predict the target lncRNAs of miRNAs. Accumulating evidence suggests that the interaction patterns between lncRNAs and miRNAs are closely related to their relative expression levels, forming a titration mechanism. This could provide an effective approach for characteristic feature extraction. In addition, the coding-non-coding co-expression network and sequence data could also help to measure the similarities among miRNAs and lncRNAs. By mathematically analyzing these types of similarities, we arrive at two findings: (i) lncRNAs/miRNAs tend to collaboratively interact with miRNAs/lncRNAs of similar expression profiles, and vice versa, and (ii) miRNAs that interact with a cluster of common target genes tend to jointly target common lncRNAs. METHODS: In this work, we developed a novel group preference Bayesian collaborative filtering model called GBCF, which produces a top-k probability-ranked list for an individual miRNA or lncRNA based on the known miRNA-lncRNA interaction network. RESULTS: To evaluate the effectiveness of GBCF, leave-one-out and k-fold cross validations as well as a series of comparison experiments were carried out. GBCF achieved areas under the ROC curve of 0.9193, 0.8354 ± 0.0079, 0.8615 ± 0.0078 and 0.8928 ± 0.0082 in leave-one-out, 2-fold, 5-fold and 10-fold cross validations, respectively, demonstrating its reliability and robustness. CONCLUSIONS: GBCF could be used to select potential lncRNA targets of specific miRNAs and offers insights for further research on ceRNA regulatory networks.
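
The two evaluation ideas in this abstract, a top-k ranked list and the area under the ROC curve, can be sketched in a few lines. The code below is a generic illustration, not GBCF itself: the scores, labels and lncRNA names are made-up placeholders, and the AUC is computed by the standard pairwise-comparison definition (the fraction of positive/negative pairs ranked correctly, with ties counted as half).

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count correctly ordered pairs; a tie contributes half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k(scores, names, k):
    """Return the k candidate names with the highest predicted scores."""
    ranked = sorted(zip(scores, names), key=lambda pair: -pair[0])
    return [name for _, name in ranked[:k]]


# Hypothetical predicted scores for one miRNA against four candidate lncRNAs.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]  # 1 = known interaction held out for testing
names = ["lnc_a", "lnc_b", "lnc_c", "lnc_d"]  # placeholder identifiers

auc = roc_auc(scores, labels)   # 3 of 4 pairs are ordered correctly
best = top_k(scores, names, 2)
```

In a k-fold setting this AUC would be computed once per fold and then summarized as a mean and standard deviation, which is the form in which the abstract reports its results.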


Subject(s)
Gene Regulatory Networks , MicroRNAs/metabolism , Models, Genetic , RNA, Long Noncoding/metabolism , Bayes Theorem , Computational Biology , Databases, Genetic , Humans , Molecular Sequence Data , Transcriptome
19.
BMC Syst Biol ; 12(Suppl 9): 121, 2018 12 31.
Article in English | MEDLINE | ID: mdl-30598090

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) play a key role in the regulatory mechanisms of human biological processes, including the development of diseases and disorders. It is necessary to identify potential miRNA biomarkers for various human diseases. Computational prediction models are expected to accelerate the identification process. RESULTS: Considering the limitations of previously proposed models, we present a novel computational model called FMSM. It infers latent miRNA biomarkers involved in the mechanisms of various diseases based on the known miRNA-disease association network, miRNA expression similarity, disease semantic similarity and Gaussian interaction profile kernel similarity. FMSM achieves reliable prediction performance in 5-fold and leave-one-out cross validations, with area under the ROC curve (AUC) values of 0.9629 ± 0.0127 and 0.9433, respectively, outperforming state-of-the-art competitors and classical algorithms. In addition, 19 of the top 25 predicted miRNAs have been validated as associated with Colonic Neoplasms in a case study. CONCLUSIONS: A factored miRNA similarity-based model and miRNA expression similarity contribute substantially to the strong prediction performance. The list of top-ranked predicted miRNA biomarkers for various human diseases has been made publicly available. It is anticipated that FMSM could serve as a useful tool for guiding future experimental validation of promising miRNA biomarker candidates.
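
The Gaussian interaction profile kernel similarity mentioned above can be sketched as follows. Each entity (here a miRNA) is represented by its row of the binary association matrix, and the similarity of two entities is exp(-gamma * squared distance between their rows), with gamma set to a bandwidth parameter divided by the mean squared norm of all rows. This is the commonly used form of the kernel; whether FMSM uses exactly this normalization or a bandwidth of 1.0 is an assumption, and the association matrix below is invented for illustration.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a 0/1 matrix."""
    n = len(profiles)
    # Normalize the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(profiles[i], profiles[j]))
             for j in range(n)] for i in range(n)]


# Hypothetical association matrix: 3 miRNAs (rows) x 3 diseases (columns).
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
similarity = gip_kernel(associations)
```

Here the first two miRNAs share an identical profile and so receive similarity 1, while the third shares no disease with them and scores much lower. Passing the transposed matrix would give the corresponding disease-disease similarity.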


Subject(s)
Computational Biology/methods , Computer Simulation , Disease/genetics , Genetic Markers/genetics , MicroRNAs/genetics , Humans
20.
Sci Rep ; 7(1): 7601, 2017 08 08.
Article in English | MEDLINE | ID: mdl-28790448

ABSTRACT

A growing body of evidence indicates that microbes are implicated in human physiological mechanisms, including complex disease pathology. Some microbes have been shown to be associated with a variety of important human diseases or disorders. By investigating these disease-related microbes, we can gain a better understanding of human disease mechanisms and advance disease diagnosis, treatment, prevention, prognosis and drug discovery. Based on the known microbe-disease association network, we developed a semi-supervised computational model, Laplacian Regularized Least Squares for Human Microbe-Disease Association (LRLSHMDA), by introducing Gaussian interaction profile kernel similarity calculation and a Laplacian regularized least squares classifier. LRLSHMDA achieved reliable AUCs of 0.8909 and 0.7657 in global and local leave-one-out cross validations, respectively. In 5-fold cross validation, an average AUC of 0.8794 ± 0.0029 further demonstrated its promising prediction ability. In case studies, 9, 9 and 8 of the top-10 predicted microbes were manually confirmed by published literature to be associated with asthma, colorectal carcinoma and chronic obstructive pulmonary disease, respectively. Our proposed model achieves better prediction performance than the previous model. We expect that LRLSHMDA could offer insights for identifying more promising human microbe-disease associations in the future.


Subject(s)
Asthma/microbiology , Carcinoma/microbiology , Colorectal Neoplasms/microbiology , Gastrointestinal Microbiome/genetics , Models, Statistical , Pulmonary Disease, Chronic Obstructive/microbiology , Actinobacteria/classification , Actinobacteria/genetics , Actinobacteria/isolation & purification , Algorithms , Asthma/diagnosis , Asthma/pathology , Carcinoma/diagnosis , Carcinoma/pathology , Clostridiaceae/classification , Clostridiaceae/genetics , Clostridiaceae/isolation & purification , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/pathology , Comamonadaceae/classification , Comamonadaceae/genetics , Comamonadaceae/isolation & purification , Databases, Factual , Firmicutes/classification , Firmicutes/genetics , Firmicutes/isolation & purification , Humans , Least-Squares Analysis , Oxalobacteraceae/classification , Oxalobacteraceae/genetics , Oxalobacteraceae/isolation & purification , Prognosis , Pulmonary Disease, Chronic Obstructive/diagnosis , Pulmonary Disease, Chronic Obstructive/pathology , Sphingomonadaceae/classification , Sphingomonadaceae/genetics , Sphingomonadaceae/isolation & purification