ABSTRACT
Drug-target interaction (DTI) is crucial in the discovery of new drugs. Computational methods can identify new drug-target interactions at low cost and with reasonable accuracy. Recent studies have paid increasing attention to machine-learning methods for DTI prediction, ranging from matrix factorization to deep learning. Because the interaction matrix is often extremely sparse, the prediction performance of matrix factorization-based methods degrades significantly. Some matrix factorization methods therefore use side information to address both the sparsity of the interaction matrix and the cold-start problem. By combining matrix factorization and autoencoders, we propose a hybrid DTI prediction model that simultaneously learns the hidden factors of drugs and targets from their side information and the interaction matrix. The proposed method is composed of two steps: pre-processing of the interaction matrix, and the hybrid model. We leverage the similarity matrices of both drugs and targets to address the sparsity problem of the interaction matrix. A comparison of our approach against other algorithms on the same reference datasets shows good results in terms of the area under the receiver operating characteristic curve and the area under the precision-recall curve. More specifically, experiments with five repetitions of tenfold cross-validation achieve high accuracy on gold-standard datasets (e.g., Nuclear Receptors, GPCRs, Ion Channels, and Enzymes). Graphical abstract: the hybrid model of matrix factorization with denoising autoencoders, which uses side information of drugs and targets to predict drug-target interactions.
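The matrix factorization component mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' hybrid model: it leaves out the autoencoder, the side information, and the pre-processing step, and it treats every zero as an observed non-interaction. The toy matrix `Y`, the function names, and all hyperparameters are assumptions chosen for illustration.

```python
import random

def factorize(Y, rank=2, epochs=2000, lr=0.05, reg=0.01, seed=0):
    """Factor Y (drugs x targets) into U and V so that U[i] . V[j] approximates Y[i][j]."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    # Small random initial latent factors for drugs (U) and targets (V).
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(m)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                err = Y[i][j] - sum(U[i][k] * V[j][k] for k in range(rank))
                for k in range(rank):
                    u, v = U[i][k], V[j][k]
                    # Gradient step on squared error with L2 regularization.
                    U[i][k] += lr * (err * v - reg * u)
                    V[j][k] += lr * (err * u - reg * v)
    return U, V

def predict(U, V, i, j):
    """Predicted interaction score for drug i and target j."""
    return sum(a * b for a, b in zip(U[i], V[j]))

# Hypothetical 3 x 3 interaction matrix (1 = known interaction).
Y = [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]]
U, V = factorize(Y)
scores = [[round(predict(U, V, i, j), 2) for j in range(3)] for i in range(3)]
```

In a sparse real-world matrix most zeros are unknown rather than negative, which is the weakness the abstract's pre-processing and side information are meant to address.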
Subject(s)
Algorithms; Machine Learning; Drug Interactions; Research Design; ROC Curve
ABSTRACT
A recommender system (RS) is highly efficient in extracting valuable information from a deluge of big data. The key issue in implementing an RS lies in uncovering users' latent preferences for different items. Latent Feature Analysis (LFA) and deep neural networks (DNNs) are two of the most popular and successful approaches to addressing this issue. However, LFA-based and DNN-based models each have their own distinct advantages and disadvantages. Consequently, relying solely on either LFA-based or DNN-based models cannot ensure optimal recommendation performance across diverse real-world application scenarios. To address this issue, this paper proposes a novel hybrid recommendation model that combines Autoencoder and LFA techniques, termed AutoLFA. The main idea of AutoLFA is two-fold: (1) it leverages an Autoencoder and an LFA model separately to construct two distinct recommendation models, each residing in a unique metric representation space with its own set of strengths; and (2) it integrates the Autoencoder and LFA model using a customized self-adaptive weighting strategy, thereby capitalizing on the merits of both approaches. To evaluate the proposed AutoLFA model, extensive experiments are conducted on five real recommendation datasets. The results demonstrate that AutoLFA achieves significantly better recommendation performance than seven related state-of-the-art models.
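The abstract does not specify the self-adaptive weighting strategy, so the following is only a sketch of one simple way to blend two recommenders. It assumes weights inversely proportional to each model's validation error; the function names and all numbers are hypothetical.

```python
def inverse_error_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [x / total for x in inverse]

def blend(predictions, weights):
    """Weighted average of per-model predictions for one user-item pair."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical validation RMSEs: autoencoder = 0.90, LFA model = 0.60.
weights = inverse_error_weights([0.90, 0.60])
# Hypothetical predicted ratings from the two models for the same pair.
rating = blend([3.8, 4.2], weights)
```

With these numbers the more accurate model receives the larger weight (0.6 against 0.4), so the blended rating sits closer to its prediction.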
ABSTRACT
In the era of rapid development of the Internet of Things, deep learning, and communication technologies, social media has become indispensable. However, while enjoying the convenience brought by technological innovation, people also face its negative effects. Taking user portraits in multimedia systems as an example, as deep facial forgery technologies mature, personal portraits are exposed to malicious tampering and forgery, which poses a potential threat to personal privacy and security and can have a negative social impact. At present, deep forgery detection methods are learning-based and therefore depend on data to a certain extent. Enriching facial anti-spoofing datasets is an effective way to address this problem. Therefore, we propose an effective face-swapping framework based on StyleGAN. We use a feature pyramid network to extract facial features and map them to the latent space of StyleGAN. To realize the transformation of identity, we explore the representation of identity information and propose an adaptive identity editing module. We design a simple and effective post-processing step to improve the authenticity of the images. Experiments show that our proposed method can effectively complete face swapping and provide high-quality data for deep forgery detection to ensure the security of multimedia systems.
Subject(s)
Image Processing, Computer-Assisted; Privacy; Humans; Image Processing, Computer-Assisted/methods
ABSTRACT
BACKGROUND: Identifying miRNA-disease associations helps us understand disease mechanisms at the molecular level. However, identification based on biological experiments is usually blind, time-consuming, and small-scale. Hence, developing computational methods to predict unknown miRNA-disease associations is becoming increasingly important. RESULTS: In this work, we develop a computational framework called SMALF to predict unknown miRNA-disease associations. SMALF first uses a stacked autoencoder to learn miRNA latent features and disease latent features from the original miRNA-disease association matrix. Then, SMALF obtains the feature vector representing each miRNA-disease pair by integrating miRNA functional similarity, miRNA latent features, disease semantic similarity, and disease latent features. Finally, XGBoost is used to predict unknown miRNA-disease associations. We carry out cross-validation experiments. Compared with other state-of-the-art methods, SMALF achieved the best AUC value. We also conduct three case studies, on hepatocellular carcinoma, colon cancer, and breast cancer. The results show that 10, 10, and 9 of the top ten predicted miRNAs, respectively, are verified in MNDR v3.0 or miRCancer. CONCLUSION: The comprehensive experimental results demonstrate that SMALF is effective in identifying unknown miRNA-disease associations.
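The feature-integration step described in this abstract can be sketched as a simple concatenation. The abstract does not state how the four parts are combined, so concatenation is an assumption, and every matrix below is a made-up toy value; in the paper the latent features come from a stacked autoencoder and the resulting vectors are passed to XGBoost.

```python
def pair_features(m, d, mirna_sim, mirna_latent, disease_sim, disease_latent):
    """Concatenate the four descriptors of miRNA m and disease d into one vector."""
    return mirna_sim[m] + mirna_latent[m] + disease_sim[d] + disease_latent[d]

# Toy inputs for 2 miRNAs and 3 diseases (hypothetical values).
mirna_sim = [[1.0, 0.3], [0.3, 1.0]]               # miRNA functional similarity
mirna_latent = [[0.1, 0.7], [0.5, 0.2]]            # miRNA latent features
disease_sim = [[1.0, 0.2, 0.0],
               [0.2, 1.0, 0.4],
               [0.0, 0.4, 1.0]]                    # disease semantic similarity
disease_latent = [[0.9, 0.1], [0.3, 0.3], [0.2, 0.8]]  # disease latent features

x = pair_features(0, 2, mirna_sim, mirna_latent, disease_sim, disease_latent)
```

Each miRNA-disease pair yields one such vector, labeled 1 for a known association, which is the form a gradient-boosted classifier expects as input.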
Subject(s)
Breast Neoplasms; MicroRNAs; Algorithms; Breast Neoplasms/genetics; Computational Biology; Humans; MicroRNAs/genetics
ABSTRACT
BACKGROUND: Drug-target interaction (DTI) plays a vital role in drug discovery. Identifying drug-target interactions through wet-lab experiments is costly, laborious, and time-consuming. Therefore, computational methods for predicting drug-target interactions are essential in the drug discovery process. Meanwhile, computational methods can reduce the search space by proposing potential drugs already validated in wet-lab experiments. Recently, deep learning-based methods for drug-target interaction prediction have received more attention. Traditionally, the performance of DTI prediction methods depends heavily on additional information, such as the protein sequence and the molecular structure of the drug, as well as on deep supervised learning. RESULTS: This paper proposes a method based on deep unsupervised learning for drug-target interaction prediction called AutoDTI++. The proposed method includes three steps. The first step is to pre-process the interaction matrix. Since the interaction matrix is sparse, we address its sparsity using drug fingerprints. Then, in the second step, the AutoDTI approach is introduced. In the third step, we post-process the output of the AutoDTI model. CONCLUSIONS: Experimental results show that we were able to improve the prediction performance. To this end, the proposed method was compared with other algorithms on the same reference datasets. In five repetitions of tenfold cross-validation on gold-standard datasets (Nuclear Receptors, GPCRs, Ion Channels, and Enzymes), the proposed method achieves good performance with high accuracy.
Subject(s)
Drug Development; Unsupervised Machine Learning; Algorithms; Amino Acid Sequence; Drug Discovery
ABSTRACT
By incorporating a growing number of sensors and adopting machine learning technologies, wearable devices have recently become a prominent health care application domain. Among the related research topics in this field, one of the most important is detecting falls while walking. Since such falls may lead to serious injuries, automatically and promptly detecting them during daily use of smartphones and/or smartwatches is a particular need. In this paper, we investigate the use of Gaussian process (GP) methods for characterizing dynamic walking patterns and detecting falls while walking with built-in wearable sensors in smartphones and/or smartwatches. For the task of characterizing dynamic walking patterns in a low-dimensional latent feature space, we propose a novel approach called the auto-encoded Gaussian process dynamical model, in which we combine a GP-based state space modeling method with a nonlinear dimensionality reduction method in a unique manner. Gaussian process methods fit this task because one of their most important strengths is their capability of handling uncertainty in the model parameters. Also, for detecting falls while walking, we propose to recycle the latent samples generated in training the auto-encoded Gaussian process dynamical model for GP-based novelty detection, which can lead to an efficient and seamless solution to the detection task. Experimental results show that the combined use of these GP-based methods can yield promising results for characterizing dynamic walking patterns and detecting falls while walking with wearable sensors.
ABSTRACT
Introduction: A growing body of research indicates that microorganisms play a crucial role in human health. Imbalances in microbial communities are closely linked to human diseases, and identifying potential relationships between microbes and diseases can help elucidate the pathogenesis of diseases. However, traditional methods based on biological or clinical experiments are costly, so the use of computational models to predict potential microbe-disease associations is of great importance. Methods: In this paper, we present a novel computational model called MLFLHMDA, which is based on a Multi-View Latent Feature Learning approach for predicting potential Human Microbe-Disease Associations. Specifically, we compute the Gaussian interaction profile kernel similarity for diseases and for microbes based on the known microbe-disease associations from the Human Microbe-Disease Association Database, and apply a preprocessing step, weighted K nearest known neighbors (WKNKN), to reduce the sparsity of the microbe-disease association matrix. To obtain unobserved associations in the microbe and disease views, we extract different latent features based on the geometrical structure of microbes and diseases, and project the multi-modal latent features into a common subspace. Next, we introduce graph regularization to preserve the local manifold structure of the Gaussian interaction profile kernel similarity and add Lp,q-norm regularization on the projection matrix to ensure the interpretability and sparsity of the model. Results: The AUC values achieved by MLFLHMDA in global leave-one-out cross-validation and 5-fold cross-validation are 0.9165 and 0.8942 ± 0.0041, respectively, which are better than those of other existing methods. In addition, case studies of different diseases have demonstrated the superior predictive power of MLFLHMDA.
The source code of our model and the data are available at https://github.com/LiangzheZhang/MLFLHMDA_master.
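The Gaussian interaction profile (GIP) kernel similarity named in the Methods can be sketched as follows. The sketch assumes the commonly used form K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), with gamma normalized by the mean squared norm of the interaction profiles; whether MLFLHMDA uses exactly this normalization is not stated in the abstract. The association matrix `A` and the parameter name `gamma_prime` are hypothetical.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix."""
    n = len(profiles)
    # Bandwidth: gamma_prime divided by the mean squared norm of the profiles.
    # Assumes at least one known association, so the mean is non-zero.
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            K[i][j] = math.exp(-gamma * dist_sq)
    return K

# Hypothetical association matrix: rows are diseases, columns are microbes.
A = [[1, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 1]]
K = gip_kernel(A)
```

Diseases 0 and 1 share a microbe and therefore score higher than diseases 0 and 2, which share none. Applying the same function to the transposed matrix gives the microbe-side similarity.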
ABSTRACT
Generative models, such as Generative Adversarial Networks (GANs), have recently shown remarkable capabilities in various generation tasks. However, the success of these models heavily depends on the availability of a large-scale training dataset. When the size of the training dataset is limited, the quality and diversity of the generated results degrade severely. In this paper, we propose a novel approach, Reverse Contrastive Learning (RCL), to address the problem of high-quality and diverse image generation under few-shot settings. The success of RCL benefits from a two-sided, powerful regularization. Our proposed regularization is designed based on the correlation between generated samples, which can effectively utilize the latent feature information between different levels of samples. It does not require any auxiliary information or augmentation techniques. A series of qualitative and quantitative results show that our proposed method is superior to existing state-of-the-art (SOTA) methods under the few-shot setting and is still competitive under the low-shot setting, showcasing the effectiveness of RCL. Code will be released upon acceptance at https://github.com/gouayao/RCL.
Subject(s)
Machine Learning; Neural Networks, Computer
ABSTRACT
Quantitative susceptibility mapping (QSM) is a post-processing technique for deriving tissue magnetic susceptibility distribution from MRI phase measurements. Deep learning (DL) algorithms hold great potential for solving the ill-posed QSM reconstruction problem. However, a significant challenge facing current DL-QSM approaches is their limited adaptability to magnetic dipole field orientation variations during training and testing. In this work, we propose a novel Orientation-Adaptive Latent Feature Editing (OA-LFE) module to learn the encoding of acquisition orientation vectors and seamlessly integrate them into the latent features of deep networks. Importantly, it can be plugged directly into various existing DL-QSM architectures in a Plug-and-Play (PnP) manner, enabling reconstruction of QSM from arbitrary magnetic dipole orientations. Its effectiveness is demonstrated by integrating the OA-LFE module into our previously proposed phase-to-susceptibility single-step instant QSM (iQSM) network, which was initially tailored for pure-axial acquisitions. The proposed OA-LFE-empowered iQSM, which we refer to as iQSM+, is trained in a simulated-supervised manner on a specially designed simulated brain dataset. Comprehensive experiments are conducted on simulated and in vivo human brain datasets, encompassing subjects ranging from healthy individuals to those with pathological conditions. These experiments involve various MRI platforms (3T and 7T) and aim to compare our proposed iQSM+ against several established QSM reconstruction frameworks, including the original iQSM. iQSM+ yields QSM images with significantly improved accuracy and fewer artifacts, surpassing other state-of-the-art DL-QSM algorithms. The versatility of the PnP OA-LFE module was further demonstrated by its successful application to xQSM, a distinct DL-QSM network for dipole inversion.
In conclusion, this work introduces a new DL paradigm, allowing researchers to develop innovative QSM methods without requiring a complete overhaul of their existing architectures.
Subject(s)
Brain; Image Processing, Computer-Assisted; Humans; Image Processing, Computer-Assisted/methods; Brain/diagnostic imaging; Neural Networks, Computer; Brain Mapping/methods; Magnetic Resonance Imaging/methods; Algorithms
ABSTRACT
Recent advances in remote sensing techniques provide a new horizon for monitoring the spatiotemporal variations of harmful algal blooms (HABs) in inland water using hyperspectral data. In this study, a hierarchical concatenated variational autoencoder (HCVAE) is proposed as an efficient and accurate deep learning (DL) based bio-optical model. To demonstrate its usefulness in retrieving algal pigments, the HCVAE is applied to bloom-prone regions in Daecheong Lake, South Korea. By abstracting the similarity between highly related features using layer-wise clique-based latent-feature extraction, HCVAE reduces the computational load of deriving outputs while preventing performance degradation. Graph-based clique detection uses information theory-based criteria to group the related reflectance spectra. Consequently, six latent features were extracted from 79 spectral bands to form the multilevel hierarchy of HCVAE, which can simultaneously estimate concentrations of chlorophyll-a (Chl-a) and phycocyanin (PC). Despite the parsimonious model architecture, the Chl-a and PC concentrations estimated by HCVAE closely agree with the measured concentrations, with test R² values of 0.76 and 0.82, respectively. In addition, spatial distribution maps of algal pigments obtained from HCVAE using drone-borne reflectance successfully capture the blooming spots. Based on its multilevel hierarchical architecture, HCVAE can provide the importance of latent features along with their individual wavelengths using Shapley additive explanations. The most important latent features covered the spectral regions associated with both Chl-a and PC. The lightweight neural network DNNsel, which uses only the spectral bands of highest importance in latent-feature extraction, performed comparably to HCVAE. The study results demonstrate the utility of the multilevel hierarchical architecture as a comprehensive assessment model for near-real-time drone-borne sensing of HABs.
Moreover, HCVAE is applicable to a wide range of environmental big data, as it can handle numerous sets of features.
Subject(s)
Cyanobacteria; Deep Learning; Unmanned Aerial Devices; Environmental Monitoring/methods; Chlorophyll A; Harmful Algal Bloom; Lakes; Plants
ABSTRACT
The steady degeneration of neurons is the hallmark of neurodegenerative illnesses, which are, by definition, incurable. Corticobasal Syndrome (CS), Huntington's Disease (HD), dementia, Amyotrophic Lateral Sclerosis (ALS), Progressive Supranuclear Palsy (PSP), and Parkinson's Disease (PD) are some of the common neurodegenerative diseases that have affected millions of people, predominantly in the older population. Various computational techniques, including but not limited to machine learning, are emerging as tools for discriminating and detecting neurological diseases. This research proposes a machine learning-based framework to detect PD, HD, and ALS from the gait signals of subjects in both binary and multi-class detection settings. The proposed detection approach combines the classification power of Naïve Bayes and Logistic Regression in a modern UltraBoost ensemble framework. The proposed method is unique in its ability to detect neurological diseases with a small number of gait features. It identifies the most essential gait features through three state-of-the-art feature selection schemes: infinite feature selection, infinite latent feature selection, and Sigmis feature selection. We observed that gait features identified through infinite feature selection yield better detection results than those obtained through infinite latent feature selection and Sigmis feature selection when detecting Parkinson's and Huntington's disease in the multi-class setting. In the binary detection setting, amyotrophic lateral sclerosis was detected with 99.1% accuracy using 18 Sigmis gait features, with 99.1% sensitivity and 98.9% specificity. Similarly, Huntington's disease was detected with 94.2% accuracy, 94.2% sensitivity, and 94.5% specificity using 5 Sigmis gait features.
Finally, Parkinson's disease was detected with 98.4% sensitivity, specificity, and detection accuracy.
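The three figures reported for each disease (sensitivity, specificity, and detection accuracy) follow from the confusion matrix of a binary detector. A minimal sketch is given below; the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of diseased subjects correctly detected
    specificity = tn / (tn + fp)   # share of healthy subjects correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 50 patients (45 detected), 100 controls (90 rejected).
sens, spec, acc = binary_metrics(tp=45, fn=5, tn=90, fp=10)
```

Accuracy is a prevalence-weighted mix of the other two, which is why the three values can be close to each other without being identical.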
ABSTRACT
Early detection and treatment of Alzheimer's Disease (AD) are important. Recently, multi-modality imaging data have promoted the development of automatic AD diagnosis. This paper proposes a method based on latent feature fusion to make full use of multi-modality image data. Specifically, we learn a specific projection matrix for each modality by introducing a binary label matrix and local geometry constraints, and then project the original features of each modality into a low-dimensional target space. In this space, we fuse the latent feature representations of the different modalities for AD classification. Experimental results on the Alzheimer's Disease Neuroimaging Initiative database demonstrate the proposed method's effectiveness in classifying AD.
Subject(s)
Alzheimer Disease; Cognitive Dysfunction; Humans; Alzheimer Disease/diagnostic imaging; Multimodal Imaging/methods; Neuroimaging/methods; Magnetic Resonance Imaging/methods; Positron-Emission Tomography/methods; Cognitive Dysfunction/diagnosis
ABSTRACT
Emerging evidence indicates that miRNAs have strong relationships with many human diseases. Investigating these associations will contribute to elucidating the activities of miRNAs and pathogenesis mechanisms, and will provide new opportunities for disease diagnosis and drug discovery. Therefore, it is important to identify potential associations between miRNAs and diseases. The existing databases of miRNA-disease associations (MDAs) only provide the known MDAs, which can be regarded as positive samples. However, the unknown MDAs cannot be regarded as reliable negative samples. To deal with this uncertainty, we propose a convolutional neural network (CNN) framework, named DNRLCNN, that predicts MDAs based on a latent feature matrix extracted from positive samples only. First, by including only the positive samples in the calculation, we capture the latent feature matrix for the complex interactions between miRNAs and diseases in a low-dimensional space. Then, we construct a feature vector for each miRNA-disease pair based on the feature representation. Finally, we apply a modified CNN to the feature vector to predict MDAs. As a result, our model achieves better performance than other state-of-the-art CNN-based methods in fivefold cross-validation on both the miRNA-disease association prediction task (average AUC of 0.9030) and the miRNA-phenotype association prediction task (average AUC of 0.9442). In addition, we carried out case studies on two human diseases: all of the top-50 predicted miRNAs for lung neoplasms are confirmed by the HMDD v3.2 and dbDEMC 2.0 databases, and 98% of the top-50 predicted miRNAs for heart failure are confirmed. The experimental results show that our model is capable of inferring potential disease-related miRNAs.
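The AUC values quoted in this abstract measure how often a known association is scored above an unknown one. A minimal sketch of that pairwise definition follows, on hypothetical scores and labels; it does not reproduce the paper's evaluation protocol or its fivefold split.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted scores for four miRNA-disease pairs.
value = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

Here three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75. In k-fold cross-validation this quantity is computed on each held-out fold and then averaged.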
Subject(s)
MicroRNAs; Algorithms; Computational Biology/methods; Genetic Predisposition to Disease; Humans; MicroRNAs/genetics; Neural Networks, Computer
ABSTRACT
Motor imagery (MI) based brain-computer interfaces help patients with movement disorders regain the ability to control external devices. Common spatial pattern (CSP) is a popular algorithm for feature extraction in decoding MI tasks. However, due to noise and nonstationarity in electroencephalography (EEG), it is not optimal to combine the corresponding features obtained from the traditional CSP algorithm. In this paper, we designed a novel CSP feature selection framework that combines the filter method and the wrapper method. We first evaluated the importance of every CSP feature with the infinite latent feature selection method. Meanwhile, we calculated the Wasserstein distance between the distributions of the same feature under different tasks. Then, we redefined the importance of every CSP feature based on these two indicators and eliminated half of the CSP features according to the new importance indicator, creating a new CSP feature subspace. Finally, we designed an improved binary gravitational search algorithm (IBGSA) by rebuilding its transfer function and applied IBGSA to the new CSP feature subspace to find the optimal feature set. To validate the proposed method, we conducted experiments on three public BCI datasets and performed a numerical analysis of the proposed algorithm for MI classification. The accuracies were comparable to those reported in related studies, and the presented model outperformed other methods in the literature on the same underlying data.
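The Wasserstein distance between the distributions of one feature under two tasks can be sketched for the simplest case. The sketch assumes the 1-Wasserstein distance between two one-dimensional samples of equal size, which reduces to the mean absolute difference of the sorted values; the abstract does not say which variant the authors used, and the feature values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size one-dimensional samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles samples of equal size")
    # With equal sizes, the optimal transport plan matches sorted values in order.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Hypothetical values of one CSP feature over trials of two MI tasks.
left_hand = [0.2, 0.5, 0.3, 0.4]
right_hand = [1.1, 1.4, 1.2, 1.3]
distance = wasserstein_1d(left_hand, right_hand)
```

A feature whose distributions lie far apart across tasks (a large distance) separates the classes well, which is why such a distance can serve as an importance indicator in a filter step.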
Subject(s)
Brain-Computer Interfaces; Algorithms; Electroencephalography; Humans; Imagination; Signal Processing, Computer-Assisted
ABSTRACT
In discovering disease etiology and pathogenesis, the associations between microRNAs (miRNAs) and diseases play a critical role. Given known miRNA-disease associations (MDAs), how to uncover potential MDAs is an important problem. To solve this problem, most existing methods regard known MDAs as positive samples and unknown ones as negative samples, and then predict possible MDAs by iteratively revising the negative samples. However, simply viewing unknown MDAs as negative samples introduces erroneous information, which may result in poor prediction performance. To avoid such defects, we present a novel method that uses only positive samples to predict MDAs by latent features extraction (LFEMDA). We design a new approach to construct the miRNA similarity matrix. LFEMDA integrates the disease similarity matrix, the known MDAs, and the miRNA similarity matrix to identify potential MDAs. By introducing miRNA and disease knowledge as auxiliary variables, the method can converge to the optimal solution in each iteration. We conduct experiments on high-association disease and new disease datasets, in which our method performs better than other methods. We also carry out a case study on breast neoplasms to further demonstrate the capacity of our method to uncover potential MDAs.
Subject(s)
Genetic Predisposition to Disease; Genome-Wide Association Study/methods; MicroRNAs/genetics; Algorithms; Humans
ABSTRACT
Identifying the target genes of transcription factors (TFs) is key to understanding transcriptional regulation. However, our understanding of genome-wide TF targeting profiles is limited due to the cost of large-scale experiments and their intrinsic complexity. Thus, computational prediction methods are useful for predicting the unobserved associations. Here, we developed a new one-class collaborative filtering algorithm, tREMAP, that is based on regularized, weighted nonnegative matrix tri-factorization. The algorithm predicts unobserved target genes for TFs using known gene-TF associations and a protein-protein interaction network. Our benchmark study shows that tREMAP significantly outperforms its counterpart REMAP, a bi-factorization-based algorithm, for transcription factor target gene prediction in all four performance metrics: AUC, MAP, MPR, and HLU. When evaluated on independent data sets, the prediction accuracy is 37.8% on the top 495 predicted associations, an enrichment factor of 4.19 compared with random guessing. Furthermore, many of the novel associations predicted by tREMAP are supported by evidence from the literature. Although we only use canonical TF-target gene interaction data in this study, tREMAP can be directly applied to tissue-specific data sets. tREMAP provides a framework for integrating multiple omics data to further improve TF target gene prediction. Thus, tREMAP is a potentially useful tool for studying gene regulatory networks. The benchmark data set and the source code of tREMAP are freely available at https://github.com/hansaimlim/REMAP/tree/master/TriFacREMAP.
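The independent-evaluation figures in this abstract can be related to each other with a little arithmetic. The sketch assumes that the enrichment factor is the ratio of the observed hit rate to the hit rate of random guessing, which is the usual definition but is not spelled out in the abstract; the derived values are therefore inferences, not reported results.

```python
top_k = 495          # number of top-ranked predictions evaluated (from the abstract)
hit_rate = 0.378     # reported prediction accuracy on those predictions
enrichment = 4.19    # reported enrichment factor over random guessing

# Approximate number of confirmed associations among the top predictions.
expected_hits = hit_rate * top_k

# Implied hit rate of random guessing, assuming enrichment = hit_rate / random_rate.
random_rate = hit_rate / enrichment
```

Under that assumption, roughly 187 of the 495 predictions were confirmed, against a random baseline of about 9%.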
ABSTRACT
In this article, we investigate the problem of detecting the boundaries of DNA copy number variation (CNV) regions using DNA-sequencing data from multiple subject samples. Genomic features along the linear realization of the actual genome are correlated, especially in the vicinity of a locus, and so are the sequencing reads along the genome. It is therefore crucial to take the correlated structure of such high-throughput genomic data into consideration when modeling DNA-sequencing data for CNV detection, from both statistical and computational viewpoints. We use the framework of a fused Lasso latent feature model to solve the problem, and propose a modified information criterion for selecting the tuning parameter when searching for common CNVs shared by multiple subjects. Simulation studies and an application to next-generation sequencing data from multiple subjects, downloaded from the 1000 Genomes Project, show that the proposed approach can effectively identify individual CNVs in a single subject's profile and common CNVs shared by multiple subjects.
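The role of the fused Lasso penalty in boundary detection can be illustrated with the generic single-profile objective. This is not the multi-subject latent feature model of the article: it only evaluates a squared-error fit plus a sparsity term and a term penalizing jumps between neighboring loci. The signal `y`, the candidate fit `flat`, and the tuning parameters `lam1` and `lam2` are hypothetical.

```python
def fused_lasso_objective(y, beta, lam1, lam2):
    """Squared-error fit plus L1 penalties on beta and on its successive differences."""
    fit = 0.5 * sum((a - b) ** 2 for a, b in zip(y, beta))
    sparsity = lam1 * sum(abs(b) for b in beta)
    fusion = lam2 * sum(abs(beta[i] - beta[i - 1]) for i in range(1, len(beta)))
    return fit + sparsity + fusion

# Hypothetical read-depth signal with one elevated segment (loci 2 to 4).
y = [0.1, -0.1, 2.0, 2.1, 1.9, 0.0]
# Piecewise-constant candidate with two change points (the segment boundaries).
flat = [0.0, 0.0, 2.0, 2.0, 2.0, 0.0]

cost_flat = fused_lasso_objective(y, flat, lam1=0.1, lam2=0.5)
cost_noisy = fused_lasso_objective(y, y, lam1=0.1, lam2=0.5)
```

The piecewise-constant candidate scores lower than the noisy signal itself because the fusion term charges for every jump, so minimizers tend to have few change points. Those change points are the estimated CNV boundaries, and the tuning parameters control how many survive.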