Results 1 - 20 of 34
1.
BMC Bioinformatics ; 23(1): 86, 2022 Mar 05.
Article in English | MEDLINE | ID: mdl-35247965

ABSTRACT

BACKGROUND: Cancer remains one of the leading causes of death worldwide, and the accumulation of genes carrying mutations is considered mainly responsible for the establishment and development of the disease. Identifying and analyzing driver genes is therefore vital. Our previous study noted the lack of agreement on a unifying pipeline for these tasks and introduced a complete one. However, that pipeline gradually revealed its weaknesses: it was unfamiliar to non-technical users, time-consuming, and inconvenient. RESULTS: This study presents an R package named DrGA, developed from our previous pipeline, to tackle these problems. It fully automates four widely used downstream analyses for predicted driver genes and offers additional improvements. We describe the usage of DrGA on driver genes of human breast cancer. We also show another potential application of DrGA: analyzing genomic biomarkers of a complex disease in another organism. CONCLUSIONS: DrGA helps users with limited IT backgrounds and rapidly produces consistent and reproducible results. DrGA and its applications, along with example data, are freely available at https://github.com/huynguyen250896/DrGA .


Subjects
Breast Neoplasms , Oncogenes , Biomarkers, Tumor , Breast Neoplasms/genetics , Female , Humans , Mutation
2.
BMC Genomics ; 23(1): 39, 2022 Jan 08.
Article in English | MEDLINE | ID: mdl-34998362

ABSTRACT

BACKGROUND: Typical challenges in detecting co-expressed gene modules include overlap between identified modules and local co-expression in only a subset of biological samples. Module detection relies on unsupervised clustering approaches and algorithms. Those methods are undoubtedly advanced, but a clustering method must be selected separately for the sample-clustering and gene-clustering tasks, and the latter is often more complicated. RESULTS: This study presents an R package, Overlapping CoExpressed gene Module (oCEM), which uses decomposition methods to address these challenges. We also developed a novel auxiliary statistical approach that selects the optimal number of principal components using a permutation procedure. We further showed that oCEM outperformed state-of-the-art techniques in detecting biologically relevant modules. CONCLUSIONS: oCEM helps non-technical users easily perform complicated statistical analyses and obtain robust results. oCEM and its applications, along with example data, are freely available at https://github.com/huynguyen250896/oCEM .


Subjects
Algorithms , Gene Regulatory Networks , Cluster Analysis , Gene Expression Profiling
3.
BMC Bioinformatics ; 21(1): 244, 2020 Jun 15.
Article in English | MEDLINE | ID: mdl-32539680

ABSTRACT

BACKGROUND: The misregulation of microRNA (miRNA) has been shown to cause diseases. Recently, we proposed a computational method based on a random walk framework on a miRNA-target gene network to predict disease-associated miRNAs. The prediction performance of our method is better than that of some existing state-of-the-art network- and machine learning-based methods because it exploits the mutual regulation between miRNAs and their target genes in miRNA-target gene interaction networks. RESULTS: To facilitate the use of this method, we have developed a Cytoscape app, named RWRMTN, to predict disease-associated miRNAs. RWRMTN can work on any miRNA-target gene network. Highly ranked miRNAs are supported with evidence from the literature. They can then be visualized according to their rankings and in relation to the query disease and their target genes. In addition, automation functions are integrated, which allow RWRMTN to be used in workflows from external environments. We demonstrate the ability of RWRMTN to predict breast and lung cancer-associated miRNAs via workflows in Cytoscape and other environments. CONCLUSIONS: Considering that few computational methods have been developed into software tools for convenient use, RWRMTN is among the first GUI-based tools for predicting disease-associated miRNAs that can be used in workflows in different environments.


Subjects
Computational Biology/methods , Gene Regulatory Networks/genetics , MicroRNAs/genetics , Humans
4.
Acta Biotheor ; 66(4): 315-331, 2018 Dec.
Article in English | MEDLINE | ID: mdl-29700660

ABSTRACT

Computational drug repositioning has proven to be a promising and efficient strategy for discovering new uses for existing drugs. To achieve this goal, a number of computational methods have been proposed, which are based on different data sources of drugs and diseases. These methods approach the problem using either machine learning- or network-based models, with the assumption that similar drugs can be used for similar diseases, to identify new indications of drugs. Therefore, similarities between drugs and between diseases are usually used as inputs. In addition, known drug-disease associations are needed by the methods as prior information. It should be noted that those associations are still not well established because many marketed drugs have been withdrawn, and this could affect the outcome of the methods. In this study, we propose a novel method named RLSDR (Regularized Least Square for Drug Repositioning) to find new uses of drugs. More specifically, it relies on a semi-supervised learning model, Regularized Least Square; thus, unlike previously proposed machine learning-based methods, it does not require the definition of non-drug-disease associations. In addition, the similarity between drugs, measured by the chemical structures of drug compounds, and the similarity between diseases that share phenotypes can be represented as either a similarity network or a similarity matrix as inputs of the method. Moreover, instead of using a gold-standard set of known drug-disease associations, we construct an artificial set of associations based on known disease-gene and drug-target associations. Experimental results demonstrate that RLSDR achieves better prediction performance on the artificial set of drug-disease associations than on the gold-standard ones in terms of area under the Receiver Operating Characteristic (ROC) curve (AUC). In addition, it outperforms two representative network-based methods irrespective of the prior information of drug-disease associations. Novel indications for a number of drugs are also identified and validated by evidence from a different data resource.


Subjects
Computational Biology/methods , Drug Repositioning , Pharmacology/methods , Supervised Machine Learning , Area Under Curve , Drug Therapy/methods , Humans , Models, Statistical , Pharmaceutical Preparations/chemistry , Pharmacy/methods , Reproducibility of Results , Software
5.
BMC Bioinformatics ; 18(1): 479, 2017 Nov 14.
Article in English | MEDLINE | ID: mdl-29137601

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identification in the laboratory of disease-related miRNAs is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that not only considers miRNAs but also the corresponding target genes in the network model. RESULTS: Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses a RWR on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. 
Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. CONCLUSIONS: Summarizing, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks.


Subjects
Disease/genetics , Gene Regulatory Networks , MicroRNAs/metabolism , Algorithms , Humans
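
The random walk with restart (RWR) ranking described in the abstract above can be illustrated with a minimal sketch. This is not the RWRMTN implementation: the toy network, the node names, the restart probability of 0.5, and the even spreading of the initial probability over the seeds are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Score every node of an undirected graph by its proximity to the seeds."""
    nodes = sorted(adjacency)
    # Initial distribution: probability mass spread evenly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seeds ...
        nxt = {n: restart * p0[n] for n in nodes}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical heterogeneous network: miRNAs and target genes share one
# undirected graph, mirroring the "mutual" miRNA-target assumption.
network = {
    "miR-A": {"GENE1", "GENE2"},
    "miR-B": {"GENE2", "GENE3"},
    "miR-C": {"GENE4"},
    "GENE1": {"miR-A"},
    "GENE2": {"miR-A", "miR-B"},
    "GENE3": {"miR-B"},
    "GENE4": {"miR-C"},
}
seeds = {"miR-A", "GENE2"}  # known disease miRNA and disease gene (invented)
scores = random_walk_with_restart(network, seeds)
candidates = sorted(
    (n for n in scores if n.startswith("miR-") and n not in seeds),
    key=scores.get,
    reverse=True,
)
print(candidates)
```

Candidate miRNAs that share target genes with the seeds receive probability mass, while a miRNA in a disconnected part of the network receives none, which is the intuition behind the "disease modules" mentioned in the abstract.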
6.
Bioinformatics ; 29(5): 630-7, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23335016

ABSTRACT

MOTIVATION: Many studies have investigated the relationship between structural properties and dynamic behaviors in biological networks. In particular, feedback loop (FBL) and feedforward loop (FFL) structures have received a great deal of attention. One interesting and common property of FBL and FFL structures is their coherency of coupling. However, the role of coherent FFLs in relation to network robustness is not fully known, whereas that of coherent FBLs has been well established. RESULTS: To establish that coherent FFLs are abundant in biological networks, we examined gene regulatory and signaling networks and found that FFLs are ubiquitous, and are in a coherently coupled form. This result was also observed in the species-based signaling networks that are integrated from KEGG database. By using a random Boolean network model, we demonstrated that these coherent FFLs can improve network robustness against update-rule perturbations. In particular, we found that coherent FFLs increase robustness because these structures induce downstream nodes to be robust against update-rule perturbations. Therefore, coherent FFLs can be considered as a design principle of human signaling networks that improve network robustness against update-rule perturbations.


Subjects
Feedback, Physiological , Gene Regulatory Networks , Signal Transduction , Humans , Models, Biological
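
The notion of a coherently coupled feedforward loop in the abstract above can be made concrete with a short sketch. It assumes the common convention that an FFL consists of edges a->b, b->c and a shortcut a->c, and that the loop is coherent when the sign of the shortcut equals the product of the signs along the indirect path. The signed edge list is invented, and the sketch does not check for additional back edges.

```python
from itertools import permutations


def feedforward_loops(edges):
    """Yield (a, b, c, coherent) for every FFL a->b->c with shortcut a->c.

    `edges` maps (source, target) to +1 (activation) or -1 (inhibition).
    """
    nodes = {n for edge in edges for n in edge}
    for a, b, c in permutations(sorted(nodes), 3):
        if (a, b) in edges and (b, c) in edges and (a, c) in edges:
            indirect = edges[(a, b)] * edges[(b, c)]  # sign of the path a->b->c
            yield a, b, c, indirect == edges[(a, c)]


# Hypothetical signed network.
signed_edges = {
    ("A", "B"): +1,
    ("B", "C"): +1,
    ("A", "C"): +1,
    ("A", "D"): +1,
    ("D", "C"): -1,
}
ffls = list(feedforward_loops(signed_edges))
print(ffls)
```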
7.
Bioinformatics ; 27(8): 1113-20, 2011 Apr 15.
Article in English | MEDLINE | ID: mdl-21325303

ABSTRACT

MOTIVATION: In general, diseases are more likely to be comorbid if they share associated genes or molecular interactions in a cellular process. However, there are still a number of disease pairs that show relatively high comorbidity but do not share any associated genes or interactions. This observation raises the need for a novel factor that can explain the underlying mechanism of comorbidity. Here we consider the feedback loop (FBL) structure, found ubiquitously in the human cell signaling network, as a key motif to explain the comorbidity phenomenon, since it is well known to affect network dynamics. RESULTS: For every pair of diseases, we examined its comorbidity and the length of all FBLs involving the disease-associated genes in the human cell signaling network. We found a negative relationship between comorbidity and the length of the involved FBLs. This indicates that a disease pair is more likely to be comorbid if the diseases are connected by FBLs of shorter length. We additionally showed that this negative relationship is more pronounced when the number of positive involved FBLs is larger than that of negative involved FBLs. Moreover, we observed that the negative relationship between comorbidity and the length of involved FBLs holds especially for disease pairs that do not share any disease-associated genes. Finally, we confirmed all these results through intensive simulations based on a Boolean network model. CONTACT: kwonyk@ulsan.ac.kr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Comorbidity , Feedback, Physiological , Signal Transduction , Disease/genetics , Humans
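
The feedback loop quantities used in the abstract above (loop length and loop sign) can be sketched as follows. The code enumerates simple directed cycles in a small signed graph and treats a loop as positive when the product of its edge signs is +1. Counting a loop as "involved" with a disease pair when it contains at least one gene of each disease is an assumption of this sketch, as are all gene names; exhaustive enumeration like this is only practical on small networks.

```python
def feedback_loops(edges):
    """Return every simple directed cycle as (nodes, length, sign)."""
    successors = {}
    for (u, v), sign in edges.items():
        successors.setdefault(u, []).append((v, sign))
    loops = []

    def extend(start, node, path, sign):
        for nxt, s in successors.get(node, []):
            if nxt == start:
                loops.append((tuple(path), len(path), sign * s))
            elif nxt > start and nxt not in path:
                # Only visit nodes larger than the start so that each
                # cycle is reported once, from its smallest node.
                extend(start, nxt, path + [nxt], sign * s)

    for start in sorted(successors):
        extend(start, start, [start], 1)
    return loops


def loops_linking(loops, genes_a, genes_b):
    """Keep loops that contain a gene of each disease (sketch assumption)."""
    return [l for l in loops if set(l[0]) & genes_a and set(l[0]) & genes_b]


# Hypothetical signed signaling network.
signaling = {
    ("G1", "G2"): +1,
    ("G2", "G1"): -1,
    ("G2", "G3"): +1,
    ("G3", "G4"): +1,
    ("G4", "G2"): +1,
}
all_loops = feedback_loops(signaling)
print(all_loops)
print(loops_linking(all_loops, {"G3"}, {"G4"}))
```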
8.
Bioinformatics ; 27(19): 2767-8, 2011 Oct 01.
Article in English | MEDLINE | ID: mdl-21828085

ABSTRACT

SUMMARY: NetDS is a novel Cytoscape plugin that conveniently simulates dynamics related to robustness, and examines structural properties with respect to feedforward/feedback loops. It can evaluate how robustly a network sustains a stable state against mutations by employing a Boolean network model. In addition, the plugin can examine all feedforward/feedback loops appearing in a network and determine whether or not a pair of loops is coupled. Random networks can also be generated to evaluate whether or not an interesting finding in real biological networks is significantly random. AVAILABILITY: NetDS is freely available for non-commercial purposes at http://netds.sourceforge.net/. CONTACT: kwonyk@ulsan.ac.kr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Feedback, Physiological , Metabolic Networks and Pathways/physiology , Models, Biological , Algorithms , Signal Transduction , Software
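
The idea of measuring how robustly a Boolean network sustains a stable state against mutations, mentioned in the abstract above, can be sketched in a few lines. This is not the NetDS algorithm. The sketch assumes a synchronous update in which a node switches on when its activating inputs outweigh its inhibiting inputs, switches off in the opposite case, and keeps its value on a tie, and it models a mutation as flipping one node in the initial state. Robustness is then the fraction of (initial state, flipped node) pairs that still reach the same attractor.

```python
from itertools import product


def step(state, edges, nodes):
    """One synchronous update of a signed threshold Boolean network."""
    new = []
    for i, node in enumerate(nodes):
        total = sum(
            sign * state[nodes.index(src)]
            for (src, dst), sign in edges.items()
            if dst == node
        )
        new.append(1 if total > 0 else 0 if total < 0 else state[i])
    return tuple(new)


def attractor(state, edges, nodes):
    """Iterate until a state repeats and return the cycle that was reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, edges, nodes)
    return frozenset(seen[seen.index(state):])


def robustness(edges, nodes):
    """Fraction of single-node flips that leave the attractor unchanged."""
    same = total = 0
    for state in product((0, 1), repeat=len(nodes)):
        reference = attractor(state, edges, nodes)
        for i in range(len(nodes)):
            flipped = state[:i] + (1 - state[i],) + state[i + 1:]
            same += attractor(flipped, edges, nodes) == reference
            total += 1
    return same / total


# Hypothetical two-node network in which A activates B.
print(robustness({("A", "B"): +1}, ["A", "B"]))
```

The exhaustive loop over all 2^n initial states only works for very small networks; for realistic sizes one would sample initial states instead.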
9.
Front Mol Biosci ; 9: 801931, 2022.
Article in English | MEDLINE | ID: mdl-35237657

ABSTRACT

It is evident that N6-methyladenosine (m6A)-modified long noncoding RNAs (m6A-lncRNAs) are involved in regulating tumorigenesis, invasion, and metastasis in various cancer types. In this study, we computationally identified a set of 13 hub m6A-lncRNAs using three state-of-the-art tools, WGCNA, iWGCNA, and oCEM, and interrogated their prognostic value in brain low-grade gliomas (LGG). Of the 13 hub m6A-lncRNAs, we further detected three as independent prognostic risk factors: HOXB-AS1, ELOA-AS1, and FLG-AS1. The m6ALncSig model was then built on these three hub m6A-lncRNAs. Patients with LGG were next divided into two groups, high- and low-risk, based on the median m6ALncSig score. As predicted, the high-risk group was more significantly related to mortality. The prognostic signature of m6ALncSig was validated using internal and external cohorts. In summary, our work introduces a high-confidence prognostic prediction signature and paves the way for using the m6A-lncRNAs in the signature as new targets for the treatment of LGG.
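
The median-based stratification described in the abstract above can be sketched as follows. The abstract does not state how the m6ALncSig score is computed; this sketch assumes a weighted sum of the three lncRNA expression values, which is a common form for such signatures. The coefficients, expression values, and patient identifiers are invented for illustration and carry no biological meaning.

```python
from statistics import median


def risk_groups(expression, coefficients):
    """Score each patient and split the cohort at the median score."""
    scores = {
        patient: sum(coefficients[g] * values[g] for g in coefficients)
        for patient, values in expression.items()
    }
    cutoff = median(scores.values())
    groups = {p: ("high" if s > cutoff else "low") for p, s in scores.items()}
    return scores, cutoff, groups


# Hypothetical coefficients (not taken from the paper).
coefficients = {"HOXB-AS1": 0.5, "ELOA-AS1": -0.25, "FLG-AS1": 0.25}
# Hypothetical expression values for four patients.
expression = {
    "P1": {"HOXB-AS1": 2.0, "ELOA-AS1": 4.0, "FLG-AS1": 0.0},
    "P2": {"HOXB-AS1": 4.0, "ELOA-AS1": 0.0, "FLG-AS1": 4.0},
    "P3": {"HOXB-AS1": 0.0, "ELOA-AS1": 4.0, "FLG-AS1": 0.0},
    "P4": {"HOXB-AS1": 2.0, "ELOA-AS1": 0.0, "FLG-AS1": 4.0},
}
scores, cutoff, groups = risk_groups(expression, coefficients)
print(cutoff, groups)
```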

10.
PLoS One ; 17(9): e0275347, 2022.
Article in English | MEDLINE | ID: mdl-36178928

ABSTRACT

BACKGROUND: Sediment scour downstream of hydraulic structures is one of the main threats to their stability. Several soil properties and initial input data have been studied with both physical and numerical models to investigate their influence on scour hole geometry. However, resistance parameters affecting sedimentation and erosion have not been examined in the literature. Besides, the auxiliary wing walls commonly used in real applications have rarely been studied for their effect on morphological change. RESULTS: In this study, a 3D Computational Fluid Dynamics model is used to calibrate the hydraulic characteristics of steady flow through the culvert by comparison with experimental data, showing good agreement in water depth, velocity, and pressure profiles at the bottom of the boxed culvert. The results show that a grid cell of 0.015 m gave minimum NRMSE and MAE values in the test cases. A second set of numerical tests examined sediment scour at a meander flume outlet for a variety of roughness/d50 ratios (cs) and diversion wall types. The findings are as follows: cs = 2.5 gives close agreement between the numerical and analytical results for maximum scour depth after the culvert; the four types of wing wall influence the geometrical deformation, including erosion at the concave bank and deposition at the convex bank of the meander flume outlet; and two short headwalls represent the best solution, producing small morphological changes. CONCLUSIONS: The influence of the roughness parameter of the soil material and of headwall types on sediment scour at the meander exit channel of a hydraulic structure can be estimated by the numerical model.


Subjects
Geologic Sediments , Water Movements , Animals , Hydrodynamics , Soil , Water
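
The two error measures named in the abstract above, NRMSE and MAE, can be computed as in the sketch below. The abstract does not say how the RMSE is normalized; dividing by the range of the observed values is one common convention and is an assumption here. The sample values are invented.

```python
import math


def mae(observed, simulated):
    """Mean absolute error between paired observations and simulations."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def nrmse(observed, simulated):
    """Root mean square error divided by the observed range (assumption)."""
    rmse = math.sqrt(
        sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)
    )
    return rmse / (max(observed) - min(observed))


# Hypothetical measured and simulated water depths (m).
observed = [0.10, 0.20, 0.30, 0.50]
simulated = [0.12, 0.18, 0.33, 0.47]
print(mae(observed, simulated), nrmse(observed, simulated))
```

In a grid-sensitivity study of the kind the abstract describes, these two values would be computed once per candidate cell size and the size with the smallest errors retained.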
11.
Article in English | MEDLINE | ID: mdl-33606633

ABSTRACT

BACKGROUND: Drug response prediction is an important problem in computational personalized medicine. Many machine learning-based methods, especially deep learning-based ones, have been proposed for this task. However, these methods often represent drugs as strings, which are not a natural way to depict molecules. Also, interpretation (e.g., which mutations or copy number aberrations contribute to the drug response) has not been considered thoroughly. METHODS: In this study, we propose a novel method, GraphDRP, based on graph convolutional networks for the problem. In GraphDRP, drugs are represented as molecular graphs that directly capture the bonds among atoms, while cell lines are depicted as binary vectors of genomic aberrations. Representative features of drugs and cell lines are learned by convolution layers and then combined to represent each drug-cell line pair. Finally, the response value of each drug-cell line pair is predicted by a fully connected neural network. Four variants of graph convolutional networks were used for learning the features of drugs. RESULTS: We found that GraphDRP outperforms tCNNS in all performance measures for all experiments. Also, through saliency maps of the resulting GraphDRP models, we discovered the contribution of the genomic aberrations to the responses. CONCLUSION: Representing drugs as graphs can improve the performance of drug response prediction. Availability of data and materials: Data and source code can be downloaded at https://github.com/hauldhut/GraphDRP.


Subjects
Neural Networks, Computer , Pharmaceutical Preparations , Genomics , Machine Learning , Software
12.
Article in English | MEDLINE | ID: mdl-34260355

ABSTRACT

Previous studies have either learned drug features from string or numeric representations, which are not natural forms of drugs, or used only genomic data of cell lines for the drug response prediction problem. Here, we propose a deep learning model, GraOmicDRP, to learn drug features from their graph representation and integrate multiple -omic data of cell lines. In GraOmicDRP, drugs are represented as graphs of bonds among atoms, while cell lines are depicted not only by genomic but also by transcriptomic and epigenomic data. Graph convolutional and convolutional neural networks were used to learn the representations of drugs and cell lines, respectively. A combination of the two representations was then used to represent each drug-cell line pair. Finally, the response value of each pair was predicted by a fully connected network. Experimental results indicate that transcriptomic data perform best among single -omic data, while the combinations of transcriptomic and other -omic data achieved the best performance overall in terms of both Root Mean Square Error and Pearson correlation coefficient. In addition, we show that GraOmicDRP outperforms some state-of-the-art methods, including ones integrating -omic data with drug information, such as GraphDRP, and ones using -omic data without drug information, such as DeepDR and MOLI.


Subjects
Genomics , Neural Networks, Computer , Cell Line
13.
PLoS One ; 16(12): e0260432, 2021.
Article in English | MEDLINE | ID: mdl-34879086

ABSTRACT

BACKGROUND: Enhancers regulate transcription of target genes, causing a change in expression level. Thus, the aberrant activity of enhancers can lead to diseases. To date, a large number of enhancers have been identified, yet only a small portion of them have been found to be associated with diseases. This raises a pressing need to develop computational methods to predict associations between diseases and enhancers. RESULTS: In this study, we predicted these associations under the assumption that enhancers sharing target genes could be associated with similar diseases. We built an enhancer functional interaction network by connecting enhancers that significantly share target genes, then developed a network diffusion method, RWDisEnh, based on a random walk with restart algorithm on networks of diseases and enhancers, to globally measure the degree of association between diseases and enhancers. RWDisEnh performed best when the disease similarities were integrated with the enhancer functional interaction network through known disease-enhancer associations, in the form of a heterogeneous network of diseases and enhancers. It was also superior to another network diffusion method, PageRank with Priors, and to a neighborhood-based one, MaxLink, which simply chooses the closest neighbors of known disease-associated enhancers. Finally, we showed that RWDisEnh could predict novel enhancers that are either directly or indirectly associated with diseases. CONCLUSIONS: Taken together, RWDisEnh could be a potential method for predicting disease-enhancer associations.


Subjects
Computational Biology/methods , Disease/genetics , Enhancer Elements, Genetic , Algorithms , Genetic Predisposition to Disease , Humans , Neural Networks, Computer , Transcription, Genetic
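
The step of "connecting enhancers significantly sharing target genes" in the abstract above can be sketched with a hypergeometric test on the overlap of two target sets. The abstract does not name the statistical test, so the hypergeometric tail probability, the significance level of 0.05, and the absence of a multiple-testing correction are all assumptions of this sketch. Enhancer and gene names are invented.

```python
from itertools import combinations
from math import comb


def shared_target_pvalue(targets_a, targets_b, n_genes):
    """P(overlap >= observed) when both target sets are drawn at random."""
    k = len(targets_a & targets_b)
    a, b = len(targets_a), len(targets_b)
    tail = sum(
        comb(a, i) * comb(n_genes - a, b - i) for i in range(k, min(a, b) + 1)
    )
    return tail / comb(n_genes, b)


def enhancer_network(targets, n_genes, alpha=0.05):
    """Connect every enhancer pair whose target overlap has p < alpha."""
    return {
        (e1, e2)
        for e1, e2 in combinations(sorted(targets), 2)
        if shared_target_pvalue(targets[e1], targets[e2], n_genes) < alpha
    }


# Hypothetical enhancer-to-target-gene map over a background of 1000 genes.
targets = {
    "enh1": {"G1", "G2", "G3", "G4"},
    "enh2": {"G2", "G3", "G4", "G5"},
    "enh3": {"G6", "G7"},
}
print(enhancer_network(targets, n_genes=1000))
```

The resulting edge set could then serve as the enhancer layer of a heterogeneous disease-enhancer network on which a random walk with restart is run.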
14.
Curr Protoc ; 1(4): e115, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33900688

ABSTRACT

The rapid growth of biomedical ontologies observed in recent years has been reported to be useful in various applications. In this article, we propose two main-function protocols-term-related and entity-related-with the three most common ontology analyses, including similarity calculation, enrichment analysis, and ontology visualization, which can be done by separate methods. Many previously developed tools implementing those methods run on different platforms and implement a limited number of the methods for similarity calculation and enrichment analysis tools for a specific type of biomedical ontology, although any type can be acceptable. Moreover, depending on each application, methods have distinct advantages; thus, the greater the number of methods a tool has, the better decisions that users make. The protocol here implements all the analyses above using an advanced popular tool called UFO. UFO is a Cytoscape app that unifies most of the semantic similarity measures for between-term and between-entity similarity calculation for biomedical ontologies in OBO format, which can calculate the similarity between two sets of entities and weigh imported entity networks, as well as generate functional similarity networks. The complete protocol can be performed in 30 min and is designed for use by biologists with no prior bioinformatics training. © 2021 Wiley Periodicals LLC. Basic Protocol: Running UFO using a list of input Gene Ontology, Disease Ontology, or Human Phenotype Ontology data.


Subjects
Biological Ontologies , Computational Biology , Diagnostic Tests, Routine , Gene Ontology , Humans , Semantics
15.
Front Oncol ; 11: 731548, 2021.
Article in English | MEDLINE | ID: mdl-34745953

ABSTRACT

Uveal melanoma (UM) is a comparatively rare cancer but requires serious consideration since patients with developing metastatic UM survive only for about 6-12 months. Fortunately, increasingly large multi-omics databases allow us to further understand cancer initiation and development. Moreover, previous studies have observed that associations between copy number aberrations (CNA) or methylation (MET) versus messenger RNA (mRNA) expression have affected these processes. From that, we decide to explore the effect of these associations on a case study of UM. Also, the current subtypes of UM display its weak association with biological phenotypes and its lack of therapy suggestions. Therefore, the re-identification of molecular subtypes is a pressing need. In this study, we recruit three omics profiles, including CNA, MET, and mRNA, in a UM cohort from The Cancer Genome Atlas (TCGA). Firstly, we identify two sets of genes, CNAexp and METexp, whose CNA and MET significantly correlated with their corresponding mRNA, respectively. Then, single and integrative analyses of the three data types are performed using the PINSPlus tool. As a result, we discover two novel integrative subgroups, IntSub1 and IntSub2, which could be a useful alternative classification for UM patients in the future. To further explore molecular events behind each subgroup, we identify their subgroup-specific genes computationally. Accordingly, the highest expressed genes among IntSub1-specific genes are mostly enriched with immune-related processes. On the other hand, IntSub2-specific genes are highly associated with cellular cation homeostasis, which responds effectively to chemotherapy using ion channel inhibitor drugs. In addition, we detect that the two integrative subgroups show different age-related risks and survival rates. These discoveries can influence the frequency of metastatic surveillance and support medical practitioners to choose an appropriate treatment regime.

16.
PLoS One ; 15(7): e0235670, 2020.
Article in English | MEDLINE | ID: mdl-32645039

ABSTRACT

BACKGROUND: Biomedical ontologies have been growing quickly and have proven useful in many biomedical applications. Important applications of those data include estimating the functional similarity between ontology terms and between annotated biomedical entities, and analyzing enrichment for a set of biomedical entities. Many semantic similarity calculation and enrichment analysis methods have been proposed for such applications. Also, a number of tools implementing the methods have been developed on different platforms. However, these tools implement only a small number of the semantic similarity calculation and enrichment analysis methods, and only for a certain type of biomedical ontology, even though the methods can be applied to all types of biomedical ontologies. More importantly, each method can be dominant in different applications; thus, users have more choice when tools implement a larger number of methods. Also, more functions would facilitate their work with ontologies. RESULTS: In this study, we developed a Cytoscape app, named UFO, which unifies most of the semantic similarity measures for between-term and between-entity similarity calculation for all types of biomedical ontologies in OBO format. Based on the similarity calculation, UFO can calculate the similarity between two sets of entities and weigh imported entity networks as well as generate functional similarity networks. Besides, it can perform enrichment analysis of a set of entities by different methods. Moreover, UFO can visualize structural relationships between ontology terms, annotating relationships between entities and terms, and functional similarity between entities. 
Finally, we demonstrated the ability of UFO through some case studies on finding the best semantic similarity measures for assessing the similarity between human disease phenotypes, constructing biomedical entity functional similarity networks for predicting disease-associated biomarkers, and performing enrichment analysis on a set of similar phenotypes. CONCLUSIONS: Taken together, UFO is expected to be a tool where biomedical ontologies can be exploited for various biomedical applications. AVAILABILITY: UFO is distributed as a Cytoscape app, and can be downloaded freely at Cytoscape App (http://apps.cytoscape.org/apps/ufo) for non-commercial use.


Subjects
Biological Ontologies , Software , Biomarkers , Diagnostic Tests, Routine , Humans , Semantics , Vocabulary, Controlled
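
Between-term semantic similarity, as discussed in the abstract above, can be illustrated with the Resnik measure, one widely used information-content-based measure. The abstract does not list which measures UFO implements, so choosing Resnik is an assumption, and the tiny ontology and annotations below are invented. The information content of a term is taken as the negative log of the fraction of entities annotated to it or to any of its descendants.

```python
import math


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology."""
    found = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def information_content(annotations, parents):
    """IC(t) = -log(fraction of entities annotated to t or a descendant)."""
    counts = {}
    for terms in annotations.values():
        covered = set().union(*(ancestors(t, parents) for t in terms))
        for t in covered:
            counts[t] = counts.get(t, 0) + 1
    n = len(annotations)
    return {t: -math.log(c / n) for t, c in counts.items()}


def resnik(t1, t2, parents, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1, parents) & ancestors(t2, parents)
    return max((ic.get(t, 0.0) for t in common), default=0.0)


# Hypothetical ontology (child -> parents) and entity annotations.
parents = {
    "metabolic disease": ["disease"],
    "diabetes": ["metabolic disease"],
    "obesity": ["metabolic disease"],
    "cancer": ["disease"],
}
annotations = {
    "E1": {"diabetes"},
    "E2": {"obesity"},
    "E3": {"cancer"},
    "E4": {"cancer"},
}
ic = information_content(annotations, parents)
print(resnik("diabetes", "obesity", parents, ic))
print(resnik("diabetes", "cancer", parents, ic))
```

Between-entity similarity is usually derived from such term scores, for example by averaging or taking the best match over the terms annotating each entity.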
17.
Brief Funct Genomics ; 19(5-6): 350-363, 2020 Dec 04.
Article in English | MEDLINE | ID: mdl-32567652

ABSTRACT

Disease gene prediction is an essential issue in biomedical research. In the early days, annotation-based approaches were proposed for this problem. With the development of high-throughput technologies, interaction data between genes/proteins have grown quickly and now cover almost the entire genome and proteome; thus, network-based methods for the problem have become prominent. In parallel, machine learning techniques, which formulate the problem as classification, have also been proposed. Here, we first present a roadmap of the machine learning-based methods for disease gene prediction. In the beginning, the problem was usually approached as binary classification, where the positive and negative training sets comprise disease genes and non-disease genes, respectively. The disease genes are those known to be associated with diseases, whereas non-disease genes were randomly selected from those not yet known to be associated with diseases. However, the latter may contain unknown disease genes. To overcome this uncertainty in defining non-disease genes, more realistic approaches have been proposed, such as unary and semi-supervised classification. Recently, more advanced methods, including ensemble learning, matrix factorization, and deep learning, have been proposed for the problem. Second, 12 representative machine learning-based methods for disease gene prediction were examined and compared in terms of prediction performance and running time. Finally, their advantages, disadvantages, interpretability, and trust were also analyzed and discussed.


Subjects
Machine Learning , Algorithms , Humans
18.
Sci Rep ; 10(1): 20521, 2020 Nov 25.
Article in English | MEDLINE | ID: mdl-33239644

ABSTRACT

The accumulation of genes carrying mutations is vital for the establishment and development of cancer. However, research on driver genes has selected and used tools and analysis models in an unsystematic and piecemeal way. Also, previous studies may have neglected low-frequency drivers and seldom predicted subgroup specificities of identified driver genes. In this study, we present an improved driver gene identification and analysis pipeline that comprises the four most widely used analyses for driver genes: enrichment analysis, association of clinical features with expression profiles of identified driver genes as well as with their functional modules, and patient stratification by existing advanced computational tools integrating multi-omics data. The general usability of the improved pipeline was demonstrated for breast cancer and validated with several independent databases. Accordingly, 31 validated driver genes, including four novel ones, were discovered. Subsequently, we detected cancer-related significantly enriched gene ontology terms and pathways, probable drug targets, two co-expressed modules significantly associated with several clinical features, such as number of positive lymph nodes, Nottingham prognostic index, and tumor stage, and two biologically distinct groups of BRCA patients. Data and source code of the case study can be downloaded at https://github.com/hauldhut/drivergene .


Subjects
Neoplasm Genes , Genomics/methods , Neoplasms/genetics , Neoplastic Gene Expression Regulation , Gene Ontology , Gene Regulatory Networks , BRCA1 Genes , BRCA2 Genes , Genetic Association Studies , Humans , Mutation/genetics , Software
19.
Front Genet ; 11: 574661, 2020.
Article in English | MEDLINE | ID: mdl-33193681

ABSTRACT

The unprecedented proliferation of large-scale, multi-omics cancer databases has given us many new insights into genomic and epigenomic deregulation in cancer. However, it remains unclear whether a systematic connection exists between copy number aberrations (CNA) and methylation (MET) and, if so, what role this connection plays in breast cancer (BRCA) tumorigenesis and progression. At the same time, the PAM50 intrinsic subtypes of BRCA have gained the most attention from BRCA experts. However, this classification system has weaknesses, including low accuracy and a possible lack of association with biological phenotypes, and further investigation of its clinical utility is still needed. In this study, we performed an integrative analysis of three omics profiles (CNA, MET, and mRNA expression) in two BRCA patient cohorts (one for discovery and another for validation) to elucidate these complicated relationships. To this end, we first established a set of CNAcor and METcor genes, whose CNA and MET levels, respectively, were significantly correlated (or anti-correlated) with their corresponding expression levels. Next, to revisit the current classification of BRCA, we performed single and integrated clustering analyses using our clustering method PINSPlus. We then discovered two biologically distinct subgroups that could serve as an improved and refined classification system for breast cancer patients and that can be validated with third-party data. Further analyses identified subgroup-specific genes and different interactions between each of the two identified subgroups and age. These findings show promise for diagnostic and prognostic use in BRCA and as a potential alternative to the PAM50 intrinsic subtypes in the future.

20.
PLoS One ; 15(6): e0229276, 2020.
Article in English | MEDLINE | ID: mdl-32542016

ABSTRACT

Tyrosine is mainly degraded in the liver by a series of enzymatic reactions. Abnormal expression of the tyrosine catabolic enzyme tyrosine aminotransferase (TAT) has been reported in patients with hepatocellular carcinoma (HCC). Despite this, aberration in tyrosine metabolism has not been investigated in cancer development. In this work, we conduct a comprehensive cross-platform study to lay a foundation for the discovery of potential therapeutic and preventive biomarkers of HCC. We explore data from The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), Gene Expression Profiling Interactive Analysis (GEPIA), Oncomine and Kaplan Meier plotter (KM plotter) and perform integrated analyses to evaluate the clinical significance and prognostic value of the tyrosine catabolic genes in HCC. We find that five tyrosine catabolic enzymes are downregulated in HCC compared to normal liver at the mRNA and protein levels. Moreover, low expression of these enzymes correlates with poorer survival in patients with HCC. Notably, we identify pathways and upstream regulators that might be involved in tyrosine catabolic reprogramming and further drive HCC development. Overall, our results underscore the alteration of tyrosine metabolism in HCC and lay a foundation for incorporating these pathway components into therapeutic and preventive strategies.


Subjects
Tumor Biomarkers/metabolism , Gene Expression Profiling , Liver Neoplasms/pathology , Tyrosine/metabolism , Tumor Cell Line , Down-Regulation , Humans , Liver Neoplasms/genetics , Liver Neoplasms/metabolism , MicroRNAs/genetics , Mutation , Prognosis