RESUMO
Herbs applicability in disease treatment has been verified through experiences over thousands of years. The understanding of herb-disease associations (HDAs) is yet far from complete due to the complicated mechanism inherent in multi-target and multi-component (MTMC) botanical therapeutics. Most of the existing prediction models fail to incorporate the MTMC mechanism. To overcome this problem, we propose a novel dual-channel hypergraph convolutional network, namely HGHDA, for HDA prediction. Technically, HGHDA first adopts an autoencoder to project components and target protein onto a low-dimensional latent space so as to obtain their embeddings by preserving similarity characteristics in their original feature spaces. To model the high-order relations between herbs and their components, we design a channel in HGHDA to encode a hypergraph that describes the high-order patterns of herb-component relations via hypergraph convolution. The other channel in HGHDA is also established in the same way to model the high-order relations between diseases and target proteins. The embeddings of drugs and diseases are then aggregated through our dual-channel network to obtain the prediction results with a scoring function. To evaluate the performance of HGHDA, a series of extensive experiments have been conducted on two benchmark datasets, and the results demonstrate the superiority of HGHDA over the state-of-the-art algorithms proposed for HDA prediction. Besides, our case study on Chuan Xiong and Astragalus membranaceus is a strong indicator to verify the effectiveness of HGHDA, as seven and eight out of the top 10 diseases predicted by HGHDA for Chuan-Xiong and Astragalus-membranaceus, respectively, have been reported in literature.
Assuntos
Algoritmos , Astragalus propinquus , Benchmarking , CarbamatosRESUMO
As microRNAs (miRNAs) are involved in many essential biological processes, their abnormal expressions can serve as biomarkers and prognostic indicators to prevent the development of complex diseases, thus providing accurate early detection and prognostic evaluation. Although a number of computational methods have been proposed to predict miRNA-disease associations (MDAs) for further experimental verification, their performance is limited primarily by the inadequacy of exploiting lower order patterns characterizing known MDAs to identify missing ones from MDA networks. Hence, in this work, we present a novel prediction model, namely HiSCMDA, by incorporating higher order network structures for improved performance of MDA prediction. To this end, HiSCMDA first integrates miRNA similarity network, disease similarity network and MDA network to preserve the advantages of all these networks. After that, it identifies overlapping functional modules from the integrated network by predefining several higher order connectivity patterns of interest. Last, a path-based scoring function is designed to infer potential MDAs based on network paths across related functional modules. HiSCMDA yields the best performance across all datasets and evaluation metrics in the cross-validation and independent validation experiments. Furthermore, in the case studies, 49 and 50 out of the top 50 miRNAs, respectively, predicted for colon neoplasms and lung neoplasms have been validated by well-established databases. Experimental results show that rich higher order organizational structures exposed in the MDA network gain new insight into the MDA prediction based on higher order connectivity patterns.
Assuntos
Neoplasias do Colo , Neoplasias Pulmonares , MicroRNAs , Humanos , MicroRNAs/genética , MicroRNAs/metabolismo , Biologia Computacional/métodos , Neoplasias Pulmonares/genética , Bases de Dados Factuais , Algoritmos , Predisposição Genética para DoençaRESUMO
The outbreak of COVID-19 caused by SARS-coronavirus (CoV)-2 has made millions of deaths since 2019. Although a variety of computational methods have been proposed to repurpose drugs for treating SARS-CoV-2 infections, it is still a challenging task for new viruses, as there are no verified virus-drug associations (VDAs) between them and existing drugs. To efficiently solve the cold-start problem posed by new viruses, a novel constrained multi-view nonnegative matrix factorization (CMNMF) model is designed by jointly utilizing multiple sources of biological information. With the CMNMF model, the similarities of drugs and viruses can be preserved from their own perspectives when they are projected onto a unified latent feature space. Based on the CMNMF model, we propose a deep learning method, namely VDA-DLCMNMF, for repurposing drugs against new viruses. VDA-DLCMNMF first initializes the node representations of drugs and viruses with their corresponding latent feature vectors to avoid a random initialization and then applies graph convolutional network to optimize their representations. Given an arbitrary drug, its probability of being associated with a new virus is computed according to their representations. To evaluate the performance of VDA-DLCMNMF, we have conducted a series of experiments on three VDA datasets created for SARS-CoV-2. Experimental results demonstrate that the promising prediction accuracy of VDA-DLCMNMF. Moreover, incorporating the CMNMF model into deep learning gains new insight into the drug repurposing for SARS-CoV-2, as the results of molecular docking experiments reveal that four antiviral drugs identified by VDA-DLCMNMF have the potential ability to treat SARS-CoV-2 infections.
Assuntos
Antivirais , Tratamento Farmacológico da COVID-19 , COVID-19 , Aprendizado Profundo , Reposicionamento de Medicamentos , Simulação de Acoplamento Molecular , SARS-CoV-2 , Antivirais/química , Antivirais/farmacocinética , COVID-19/metabolismo , Humanos , SARS-CoV-2/química , SARS-CoV-2/metabolismoRESUMO
Drug-drug interactions (DDIs) are known as the main cause of life-threatening adverse events, and their identification is a key task in drug development. Existing computational algorithms mainly solve this problem by using advanced representation learning techniques. Though effective, few of them are capable of performing their tasks on biomedical knowledge graphs (KGs) that provide more detailed information about drug attributes and drug-related triple facts. In this work, an attention-based KG representation learning framework, namely DDKG, is proposed to fully utilize the information of KGs for improved performance of DDI prediction. In particular, DDKG first initializes the representations of drugs with their embeddings derived from drug attributes with an encoder-decoder layer, and then learns the representations of drugs by recursively propagating and aggregating first-order neighboring information along top-ranked network paths determined by neighboring node embeddings and triple facts. Last, DDKG estimates the probability of being interacting for pairwise drugs with their representations in an end-to-end manner. To evaluate the effectiveness of DDKG, extensive experiments have been conducted on two practical datasets with different sizes, and the results demonstrate that DDKG is superior to state-of-the-art algorithms on the DDI prediction task in terms of different evaluation metrics across all datasets.
Assuntos
Redes Neurais de Computação , Reconhecimento Automatizado de Padrão , Algoritmos , Interações Medicamentosas , Bases de ConhecimentoRESUMO
Identifying new indications for drugs plays an essential role at many phases of drug research and development. Computational methods are regarded as an effective way to associate drugs with new indications. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering the biological knowledge of drugs and diseases, which are believed to be useful for improving the accuracy of drug repositioning. To this end, a novel heterogeneous information network (HIN) based model, namely HINGRL, is proposed to precisely identify new indications for drugs based on graph representation learning techniques. More specifically, HINGRL first constructs a HIN by integrating drug-disease, drug-protein and protein-disease biological networks with the biological knowledge of drugs and diseases. Then, different representation strategies are applied to learn the features of nodes in the HIN from the topological and biological perspectives. Finally, HINGRL adopts a Random Forest classifier to predict unknown drug-disease associations based on the integrated features of drugs and diseases obtained in the previous step. Experimental results demonstrate that HINGRL achieves the best performance on two real datasets when compared with state-of-the-art models. Besides, our case studies indicate that the simultaneous consideration of network topology and biological knowledge of drugs and diseases allows HINGRL to precisely predict drug-disease associations from a more comprehensive perspective. The promising performance of HINGRL also reveals that the utilization of rich heterogeneous information provides an alternative view for HINGRL to identify novel drug-disease associations especially for new diseases.
Assuntos
Serviços de Informação , Aprendizado de Máquina , Preparações Farmacêuticas , Algoritmos , Biologia Computacional/métodos , Doença , Reposicionamento de Medicamentos/métodos , Humanos , Modelos Teóricos , Redes Neurais de ComputaçãoRESUMO
Drug repositioning (DR) is a promising strategy to discover new indicators of approved drugs with artificial intelligence techniques, thus improving traditional drug discovery and development. However, most of DR computational methods fall short of taking into account the non-Euclidean nature of biomedical network data. To overcome this problem, a deep learning framework, namely DDAGDL, is proposed to predict drug-drug associations (DDAs) by using geometric deep learning (GDL) over heterogeneous information network (HIN). Incorporating complex biological information into the topological structure of HIN, DDAGDL effectively learns the smoothed representations of drugs and diseases with an attention mechanism. Experiment results demonstrate the superior performance of DDAGDL on three real-world datasets under 10-fold cross-validation when compared with state-of-the-art DR methods in terms of several evaluation metrics. Our case studies and molecular docking experiments indicate that DDAGDL is a promising DR tool that gains new insights into exploiting the geometric prior knowledge for improved efficacy.
Assuntos
Aprendizado Profundo , Reposicionamento de Medicamentos , Reposicionamento de Medicamentos/métodos , Inteligência Artificial , Simulação de Acoplamento Molecular , Serviços de Informação , Algoritmos , Biologia Computacional/métodosRESUMO
MOTIVATION: Interaction between transcription factor (TF) and its target genes establishes the knowledge foundation for biological researches in transcriptional regulation, the number of which is, however, still limited by biological techniques. Existing computational methods relevant to the prediction of TF-target interactions are mostly proposed for predicting binding sites, rather than directly predicting the interactions. To this end, we propose here a graph attention-based autoencoder model to predict TF-target gene interactions using the information of the known TF-target gene interaction network combined with two sequential and chemical gene characters, considering that the unobserved interactions between transcription factors and target genes can be predicted by learning the pattern of the known ones. To the best of our knowledge, the proposed model is the first attempt to solve this problem by learning patterns from the known TF-target gene interaction network. RESULTS: In this paper, we formulate the prediction task of TF-target gene interactions as a link prediction problem on a complex knowledge graph and propose a deep learning model called GraphTGI, which is composed of a graph attention-based encoder and a bilinear decoder. We evaluated the prediction performance of the proposed method on a real dataset, and the experimental results show that the proposed model yields outstanding performance with an average AUC value of 0.8864 +/- 0.0057 in the 5-fold cross-validation. It is anticipated that the GraphTGI model can effectively and efficiently predict TF-target gene interactions on a large scale. AVAILABILITY: Python code and the datasets used in our studies are made available at https://github.com/YanghanWu/GraphTGI.
Assuntos
Redes Neurais de ComputaçãoRESUMO
While the technologies of ribonucleic acid-sequence (RNA-seq) and transcript assembly analysis have continued to improve, a novel topology of RNA transcript was uncovered in the last decade and is called circular RNA (circRNA). Recently, researchers have revealed that they compete with messenger RNA (mRNA) and long noncoding for combining with microRNA in gene regulation. Therefore, circRNA was assumed to be associated with complex disease and discovering the relationship between them would contribute to medical research. However, the work of identifying the association between circRNA and disease in vitro takes a long time and usually without direction. During these years, more and more associations were verified by experiments. Hence, we proposed a computational method named identifying circRNA-disease association based on graph representation learning (iGRLCDA) for the prediction of the potential association of circRNA and disease, which utilized a deep learning model of graph convolution network (GCN) and graph factorization (GF). In detail, iGRLCDA first derived the hidden feature of known associations between circRNA and disease using the Gaussian interaction profile (GIP) kernel combined with disease semantic information to form a numeric descriptor. After that, it further used the deep learning model of GCN and GF to extract hidden features from the descriptor. Finally, the random forest classifier is introduced to identify the potential circRNA-disease association. The five-fold cross-validation of iGRLCDA shows strong competitiveness in comparison with other excellent prediction models at the gold standard data and achieved an average area under the receiver operating characteristic curve of 0.9289 and an area under the precision-recall curve of 0.9377. On reviewing the prediction results from the relevant literature, 22 of the top 30 predicted circRNA-disease associations were noted in recent published papers. These exceptional results make us believe that iGRLCDA can provide reliable circRNA-disease associations for medical research and reduce the blindness of wet-lab experiments.
Assuntos
MicroRNAs , RNA Circular , Algoritmos , Biologia Computacional/métodos , MicroRNAs/genética , Curva ROCRESUMO
MOTIVATION: The task of predicting drug-target interactions (DTIs) plays a significant role in facilitating the development of novel drug discovery. Compared with laboratory-based approaches, computational methods proposed for DTI prediction are preferred due to their high-efficiency and low-cost advantages. Recently, much attention has been attracted to apply different graph neural network (GNN) models to discover underlying DTIs from heterogeneous biological information network (HBIN). Although GNN-based prediction methods achieve better performance, they are prone to encounter the over-smoothing simulation when learning the latent representations of drugs and targets with their rich neighborhood information in HBIN, and thereby reduce the discriminative ability in DTI prediction. RESULTS: In this work, an improved graph representation learning method, namely iGRLDTI, is proposed to address the above issue by better capturing more discriminative representations of drugs and targets in a latent feature space. Specifically, iGRLDTI first constructs an HBIN by integrating the biological knowledge of drugs and targets with their interactions. After that, it adopts a node-dependent local smoothing strategy to adaptively decide the propagation depth of each biomolecule in HBIN, thus significantly alleviating over-smoothing by enhancing the discriminative ability of feature representations of drugs and targets. Finally, a Gradient Boosting Decision Tree classifier is used by iGRLDTI to predict novel DTIs. Experimental results demonstrate that iGRLDTI yields better performance that several state-of-the-art computational methods on the benchmark dataset. Besides, our case study indicates that iGRLDTI can successfully identify novel DTIs with more distinguishable features of drugs and targets. AVAILABILITY AND IMPLEMENTATION: Python codes and dataset are available at https://github.com/stevejobws/iGRLDTI/.
Assuntos
Descoberta de Drogas , Redes Neurais de Computação , Simulação por Computador , Descoberta de Drogas/métodos , Interações MedicamentosasRESUMO
Interactions between transcription factor and target gene form the main part of gene regulation network in human, which are still complicating factors in biological research. Specifically, for nearly half of those interactions recorded in established database, their interaction types are yet to be confirmed. Although several computational methods exist to predict gene interactions and their type, there is still no method available to predict them solely based on topology information. To this end, we proposed here a graph-based prediction model called KGE-TGI and trained in a multi-task learning manner on a knowledge graph that we specially constructed for this problem. The KGE-TGI model relies on topology information rather than being driven by gene expression data. In this paper, we formulate the task of predicting interaction types of transcript factor and target genes as a multi-label classification problem for link types on a heterogeneous graph, coupled with solving another link prediction problem that is inherently related. We constructed a ground truth dataset as benchmark and evaluated the proposed method on it. As a result of the 5-fold cross experiments, the proposed method achieved average AUC values of 0.9654 and 0.9339 in the tasks of link prediction and link type classification, respectively. In addition, the results of a series of comparison experiments also prove that the introduction of knowledge information significantly benefits to the prediction and that our methodology achieve state-of-the-art performance in this problem.
Assuntos
Reconhecimento Automatizado de Padrão , Fatores de Transcrição , Humanos , Bases de Dados Factuais , Fatores de Transcrição/genética , Redes Reguladoras de Genes , Proteoma , Algoritmos , Biologia de Sistemas , Ontologia GenéticaRESUMO
Discovering new indications for existing drugs is a promising development strategy at various stages of drug research and development. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering available higher-order connectivity patterns in heterogeneous biological information networks, which are believed to be useful for improving the accuracy of new drug discovering. To this end, we propose a computational-based model, called SFRLDDA, for drug-disease association prediction by using semantic graph and function similarity representation learning. Specifically, SFRLDDA first integrates a heterogeneous information network (HIN) by drug-disease, drug-protein, protein-disease associations, and their biological knowledge. Second, different representation learning strategies are applied to obtain the feature representations of drugs and diseases from different perspectives over semantic graph and function similarity graphs constructed, respectively. At last, a Random Forest classifier is incorporated by SFRLDDA to discover potential drug-disease associations (DDAs). Experimental results demonstrate that SFRLDDA yields a best performance when compared with other state-of-the-art models on three benchmark datasets. Moreover, case studies also indicate that the simultaneous consideration of semantic graph and function similarity of drugs and diseases in the HIN allows SFRLDDA to precisely predict DDAs in a more comprehensive manner.
Assuntos
Algoritmos , Semântica , Serviços de InformaçãoRESUMO
BACKGROUND: As an important task in bioinformatics, clustering analysis plays a critical role in understanding the functional mechanisms of many complex biological systems, which can be modeled as biological networks. The purpose of clustering analysis in biological networks is to identify functional modules of interest, but there is a lack of online clustering tools that visualize biological networks and provide in-depth biological analysis for discovered clusters. RESULTS: Here we present BioCAIV, a novel webserver dedicated to maximize its accessibility and applicability on the clustering analysis of biological networks. This, together with its user-friendly interface, assists biological researchers to perform an accurate clustering analysis for biological networks and identify functionally significant modules for further assessment. CONCLUSIONS: BioCAIV is an efficient clustering analysis webserver designed for a variety of biological networks. BioCAIV is freely available without registration requirements at http://bioinformatics.tianshanzw.cn:8888/BioCAIV/ .
Assuntos
Biologia Computacional , Software , Análise por ConglomeradosRESUMO
Proteins interact with each other to play critical roles in many biological processes in cells. Although promising, laboratory experiments usually suffer from the disadvantages of being time-consuming and labor-intensive. The results obtained are often not robust and considerably uncertain. Due recently to advances in high-throughput technologies, a large amount of proteomics data has been collected and this presents a significant opportunity and also a challenge to develop computational models to predict protein-protein interactions (PPIs) based on these data. In this paper, we present a comprehensive survey of the recent efforts that have been made towards the development of effective computational models for PPI prediction. The survey introduces the algorithms that can be used to learn computational models for predicting PPIs, and it classifies these models into different categories. To understand their relative merits, the paper discusses different validation schemes and metrics to evaluate the prediction performance. Biological databases that are commonly used in different experiments for performance comparison are also described and their use in a series of extensive experiments to compare different prediction models are discussed. Finally, we present some open issues in PPI prediction for future work. We explain how the performance of PPI prediction can be improved if these issues are effectively tackled.
Assuntos
Biologia Computacional/métodos , Mapeamento de Interação de Proteínas/métodos , Proteínas/metabolismo , Software , Máquina de Vetores de Suporte , Bases de Dados Genéticas , Bases de Dados de Proteínas , Ontologia Genética , Humanos , Modelos Moleculares , Conformação Proteica , Domínios e Motivos de Interação entre Proteínas , Mapeamento de Interação de Proteínas/estatística & dados numéricos , Proteínas/química , Proteínas/genética , Saccharomyces cerevisiae/genética , Saccharomyces cerevisiae/metabolismoRESUMO
A topological insulator (TI) interfaced with an s-wave superconductor has been predicted to host topological superconductivity. Although the growth of epitaxial TI films on s-wave superconductors has been achieved by molecular-beam epitaxy, it remains an outstanding challenge for synthesizing atomically thin TI/superconductor heterostructures, which are critical for engineering the topological superconducting phase. Here we used molecular-beam epitaxy to grow Bi2Se3 films with a controlled thickness on monolayer NbSe2 and performed in situ angle-resolved photoemission spectroscopy and ex situ magnetotransport measurements on these heterostructures. We found that the emergence of Rashba-type bulk quantum-well bands and spin-non-degenerate surface states coincides with a marked suppression of the in-plane upper critical magnetic field of the superconductivity in Bi2Se3/monolayer NbSe2 heterostructures. This is a signature of a crossover from Ising- to Rashba-type superconducting pairings, induced by altering the Bi2Se3 film thickness. Our work opens a route for exploring a robust topological superconducting phase in TI/Ising superconductor heterostructures.
RESUMO
BACKGROUND: The site information of substrates that can be cleaved by human immunodeficiency virus 1 proteases (HIV-1 PRs) is of great significance for designing effective inhibitors against HIV-1 viruses. A variety of machine learning-based algorithms have been developed to predict HIV-1 PR cleavage sites by extracting relevant features from substrate sequences. However, only relying on the sequence information is not sufficient to ensure a promising performance due to the uncertainty in the way of separating the datasets used for training and testing. Moreover, the existence of noisy data, i.e., false positive and false negative cleavage sites, could negatively influence the accuracy performance. RESULTS: In this work, an ensemble learning algorithm for predicting HIV-1 PR cleavage sites, namely EM-HIV, is proposed by training a set of weak learners, i.e., biased support vector machine classifiers, with the asymmetric bagging strategy. By doing so, the impact of data imbalance and noisy data can thus be alleviated. Besides, in order to make full use of substrate sequences, the features used by EM-HIV are collected from three different coding schemes, including amino acid identities, chemical properties and variable-length coevolutionary patterns, for the purpose of constructing more relevant feature vectors of octamers. Experiment results on three independent benchmark datasets demonstrate that EM-HIV outperforms state-of-the-art prediction algorithm in terms of several evaluation metrics. Hence, EM-HIV can be regarded as a useful tool to accurately predict HIV-1 PR cleavage sites.
Assuntos
Protease de HIV , HIV-1 , Algoritmos , Protease de HIV/química , HIV-1/enzimologia , Aprendizado de Máquina , Especificidade por SubstratoRESUMO
BACKGROUND: Drug repositioning is a very important task that provides critical information for exploring the potential efficacy of drugs. Yet developing computational models that can effectively predict drug-disease associations (DDAs) is still a challenging task. Previous studies suggest that the accuracy of DDA prediction can be improved by integrating different types of biological features. But how to conduct an effective integration remains a challenging problem for accurately discovering new indications for approved drugs. METHODS: In this paper, we propose a novel meta-path based graph representation learning model, namely RLFDDA, to predict potential DDAs on heterogeneous biological networks. RLFDDA first calculates drug-drug similarities and disease-disease similarities as the intrinsic biological features of drugs and diseases. A heterogeneous network is then constructed by integrating DDAs, disease-protein associations and drug-protein associations. With such a network, RLFDDA adopts a meta-path random walk model to learn the latent representations of drugs and diseases, which are concatenated to construct joint representations of drug-disease associations. As the last step, we employ the random forest classifier to predict potential DDAs with their joint representations. RESULTS: To demonstrate the effectiveness of RLFDDA, we have conducted a series of experiments on two benchmark datasets by following a ten-fold cross-validation scheme. The results show that RLFDDA yields the best performance in terms of AUC and F1-score when compared with several state-of-the-art DDAs prediction models. We have also conducted a case study on two common diseases, i.e., paclitaxel and lung tumors, and found that 7 out of top-10 diseases and 8 out of top-10 drugs have already been validated for paclitaxel and lung tumors respectively with literature evidence. Hence, the promising performance of RLFDDA may provide a new perspective for novel DDAs discovery over heterogeneous networks.
Assuntos
Aprendizagem , Neoplasias Pulmonares , Humanos , Benchmarking , Descoberta de Drogas , PaclitaxelRESUMO
BACKGROUND: Protein-protein interaction (PPI) plays an important role in regulating cells and signals. Despite the ongoing efforts of the bioassay group, continued incomplete data limits our ability to understand the molecular roots of human disease. Therefore, it is urgent to develop a computational method to predict PPIs from the perspective of molecular system. METHODS: In this paper, a highly efficient computational model, MTV-PPI, is proposed for PPI prediction based on a heterogeneous molecular network by learning inter-view protein sequences and intra-view interactions between molecules simultaneously. On the one hand, the inter-view feature is extracted from the protein sequence by k-mer method. On the other hand, we use a popular embedding method LINE to encode the heterogeneous molecular network to obtain the intra-view feature. Thus, the protein representation used in MTV-PPI is constructed by the aggregation of its inter-view feature and intra-view feature. Finally, random forest is integrated to predict potential PPIs. RESULTS: To prove the effectiveness of MTV-PPI, we conduct extensive experiments on a collected heterogeneous molecular network with the accuracy of 86.55%, sensitivity of 82.49%, precision of 89.79%, AUC of 0.9301 and AUPR of 0.9308. Further comparison experiments are performed with various protein representations and classifiers to indicate the effectiveness of MTV-PPI in predicting PPIs based on a complex network. CONCLUSION: The achieved experimental results illustrate that MTV-PPI is a promising tool for PPI prediction, which may provide a new perspective for the future interactions prediction researches based on heterogeneous molecular network.
Assuntos
Mapeamento de Interação de Proteínas , Proteínas , Sequência de Aminoácidos , Biologia Computacional/métodos , Humanos , Mapeamento de Interação de Proteínas/métodos , Proteínas/metabolismoRESUMO
MOTIVATION: Clustering analysis in a biological network is to group biological entities into functional modules, thus providing valuable insight into the understanding of complex biological systems. Existing clustering techniques make use of lower-order connectivity patterns at the level of individual biological entities and their connections, but few of them can take into account of higher-order connectivity patterns at the level of small network motifs. RESULTS: Here, we present a novel clustering framework, namely HiSCF, to identify functional modules based on the higher-order structure information available in a biological network. Taking advantage of higher-order Markov stochastic process, HiSCF is able to perform the clustering analysis by exploiting a variety of network motifs. When compared with several state-of-the-art clustering models, HiSCF yields the best performance for two practical clustering applications, i.e. protein complex identification and gene co-expression module detection, in terms of accuracy. The promising performance of HiSCF demonstrates that the consideration of higher-order network motifs gains new insight into the analysis of biological networks, such as the identification of overlapping protein complexes and the inference of new signaling pathways, and also reveals the rich higher-order organizational structures presented in biological networks. AVAILABILITY AND IMPLEMENTATION: HiSCF is available at https://github.com/allenv5/HiSCF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Assuntos
Algoritmos , Redes Reguladoras de Genes , Análise por Conglomerados , Expressão Gênica , Cadeias de MarkovRESUMO
In this Letter, we establish a new theoretical paradigm for vortex Majorana physics in the recently discovered topological iron-based superconductors (TFeSCs). While TFeSCs are widely accepted as an exemplar of topological insulators (TIs) with intrinsic s-wave superconductivity, our theory implies that such a common belief could be oversimplified. Our main finding is that the normal-state bulk Dirac nodes, usually ignored in TI-based vortex Majorana theories for TFeSCs, will play a key role of determining the vortex state topology. In particular, the interplay between TI and Dirac nodal bands will lead to multiple competing topological phases for a superconducting vortex line in TFeSCs, including an unprecedented hybrid topological vortex state that carries both Majorana bound states and a gapless dispersion. Remarkably, this exotic hybrid vortex phase generally exists in the vortex phase diagram for our minimal model for TFeSCs and is directly relevant to TFeSC candidates such as LiFeAs. When the fourfold rotation symmetry is broken by vortex-line tilting or curving, the hybrid vortex gets topologically trivialized and becomes Majorana free, which could explain the puzzle of ubiquitous trivial vortices observed in LiFeAs. The origin of the Majorana signal in other TFeSC candidates such as FeTe_{x}Se_{1-x} and CaKFe_{4}As_{4} is also interpreted within our theory framework. Our theory sheds new light on theoretically understanding and experimentally engineering Majorana physics in high-temperature iron-based systems.
RESUMO
In two-dimensional insulators with time-reversal (TR) symmetry, a nonzero local Berry curvature of low-energy massive Dirac fermions can give rise to nontrivial spin and charge responses, even though the integral of the Berry curvature over all occupied states is zero. In this Letter, we present a new effect induced by the electronic Berry curvature. By studying electron-phonon interactions in BaMnSb_{2}, a prototype two-dimensional Dirac material possessing two TR-related massive Dirac cones, we find that the nonzero local Berry curvature of electrons can induce a phonon angular momentum. The direction of this phonon angular momentum is locked to the phonon propagation direction, and thus we refer to it as "phonon helicity" in a way that is reminiscent of electron helicity in spin-orbit-coupled electronic systems. We discuss possible experimental probes of such phonon helicity.