1.
Nucleic Acids Res ; 52(D1): D1418-D1428, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37889037

ABSTRACT

Emerging CRISPR-Cas9 technology permits synthetic lethality (SL) screening of a large number of gene pairs from gene combination double knockout (CDKO) experiments. However, the poor integration and annotation of CDKO SL data in current SL databases limit their utility, and diverse methods of calculating SL scores prohibit their comparison. To overcome these shortcomings, we have developed the SL knowledge base (SLKB), which incorporates data from 11 CDKO experiments in 22 cell lines, 16,059 SL gene pairs and 264,424 non-SL gene pairs. Additionally, within SLKB, we have implemented five SL calculation methods: median score with and without background control normalization (Median-B/NB), sgRNA-derived score (sgRNA-B/NB), Horlbeck score, GEMINI score and MAGeCK score. The five scores have demonstrated a mere 1.21% overlap among their top 10% SL gene pairs, reflecting high diversity. Users can browse SL networks and assess the impact of scoring methods using Venn diagrams. The SL network generated from all data in SLKB shows a greater likelihood of SL gene pair connectivity with other SL gene pairs than with non-SL pairs. Comparison of SL networks between two cell lines demonstrated a greater likelihood of sharing SL hub genes than SL gene pairs. The SLKB website and pipeline can be freely accessed at https://slkb.osubmi.org and https://slkb.docs.osubmi.org/, respectively.
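The overlap among the top-scoring pairs of several scoring methods can be illustrated with a small sketch. This is not the SLKB pipeline: the function names are hypothetical, higher scores are assumed to mean stronger SL, and the overlap is assumed to be measured against the union of the top sets, since the abstract does not state the denominator.

```python
def top_fraction(scores, fraction=0.10):
    """Return the set of gene pairs ranked in the top `fraction` by score.

    Assumption: a higher score means a stronger SL signal.
    """
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap_percent(score_tables, fraction=0.10):
    """Percent of pairs found in every method's top set.

    Assumption: the denominator is the union of all top sets.
    """
    tops = [top_fraction(table, fraction) for table in score_tables]
    shared = set.intersection(*tops)
    union = set.union(*tops)
    return 100.0 * len(shared) / len(union)


# Toy data: ten hypothetical gene pairs scored by two hypothetical methods.
pairs = [("G%d" % i, "H%d" % i) for i in range(10)]
method_a = {pair: float(i) for i, pair in enumerate(pairs)}
method_b = {pair: 0.0 for pair in pairs}
method_b[pairs[9]] = 10.0
method_b[pairs[0]] = 9.0

print(overlap_percent([method_a, method_b], fraction=0.2))
```

With a low overlap such as the one reported, most pairs in any one method's top set are absent from at least one other method's top set.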


Subject(s)
Knowledge Bases , Synthetic Lethal Mutations , Humans , RNA, Guide, CRISPR-Cas Systems , Internet Use
2.
J Biomed Semantics ; 14(1): 5, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37248476

ABSTRACT

BACKGROUND: Drug-drug interaction (DDI) information retrieval (IR) from the PubMed literature is an important natural language processing (NLP) task. For the first time, active learning (AL) is studied in DDI IR analysis. DDI IR analysis from PubMed abstracts faces the challenge of relatively few positive DDI samples among an overwhelmingly large number of negative samples. Random negative sampling and positive sampling are purposely designed to improve the efficiency of AL analysis. The consistency of random negative sampling and positive sampling is shown in the paper. RESULTS: PubMed abstracts are divided into two pools. The screened pool contains all abstracts that pass the DDI keywords query in PubMed, while the unscreened pool includes all the other abstracts. At a prespecified recall rate of 0.95, DDI IR analysis precision is evaluated and compared. In the screened pool IR analysis using a support vector machine (SVM), similarity sampling plus uncertainty sampling improves the precision over uncertainty sampling from 0.89 to 0.92. In the unscreened pool IR analysis, the integrated random negative sampling, positive sampling, and similarity sampling improve the precision over uncertainty sampling alone from 0.72 to 0.81. When we change the SVM to a deep learning method, all sampling schemes consistently improve DDI AL analysis in both the screened pool and the unscreened pool. Deep learning significantly improves precision over SVM: 0.96 vs. 0.92 in the screened pool, and 0.90 vs. 0.81 in the unscreened pool. CONCLUSIONS: By integrating various sampling schemes and deep learning algorithms into AL, the DDI IR analysis from literature is significantly improved. Random negative sampling and positive sampling are highly effective methods for improving AL analysis where the positive and negative samples are extremely imbalanced.
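Evaluating precision at a prespecified recall can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the function name and toy data are hypothetical, documents are ranked by classifier score, and ties between scores are not given any special handling.

```python
def precision_at_recall(labels, scores, target_recall=0.95):
    """Precision at the shortest ranked cutoff whose recall reaches target_recall.

    labels: 1 for a positive (DDI-relevant) abstract, 0 otherwise.
    scores: classifier scores, higher meaning more likely positive.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank abstracts from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda item: item[0], reverse=True)
    true_pos = 0
    for n_retrieved, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / n_retrieved
    return true_pos / len(ranked)


# Toy example: two positives among five abstracts.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
print(precision_at_recall(toy_labels, toy_scores, target_recall=0.95))
```

Fixing recall first and then comparing precision keeps the comparison between sampling schemes at the same level of coverage of the positive class.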


Subject(s)
Deep Learning , Information Storage and Retrieval , Algorithms , Drug Interactions , PubMed
3.
Front Genet ; 13: 961611, 2022.
Article in English | MEDLINE | ID: mdl-36531238

ABSTRACT

Synthetic lethality (SL) refers to a genetic interaction in which the simultaneous perturbation of two genes leads to cell or organism death, whereas viability is maintained when only one of the pair is altered. The experimental exploration of these pairs and predictive modeling in computational biology contribute to our understanding of cancer biology and the development of cancer therapies. We extensively reviewed experimental technologies, public data sources, and predictive models in the study of synthetic lethal gene pairs and herein detail biological assumptions, experimental data, statistical models, and computational schemes of various predictive models, speculate regarding their influence on individual sample- and population-based synthetic lethal interactions, discuss the pros and cons of existing SL data and models, and highlight potential research directions in SL discovery.

4.
STAR Protoc ; 3(3): 101556, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36060092

ABSTRACT

Combinatorial CRISPR screening is useful for investigating synthetic lethality (SL) gene pairs. Here, we detail the steps for dual-gRNA library construction, with the introduction of two backbones, LentiGuide_DKO and LentiCRISPR_DKO. We describe steps for in vitro screening with 22Rv1-Cas9 and SaOS2-Cas9 cells followed by sequencing and data analysis. By introducing two backbones, we optimized the library construction process, facilitated standard paired-end sequencing, and provided options for screening in cells with or without modification of Cas9 expression.


Subject(s)
CRISPR-Cas Systems , RNA, Guide, Kinetoplastida , CRISPR-Cas Systems/genetics , Gene Library , Genes, Lethal , RNA, Guide, Kinetoplastida/genetics , Synthetic Lethal Mutations
5.
Front Genet ; 13: 1103092, 2022.
Article in English | MEDLINE | ID: mdl-36699450

ABSTRACT

Synthetic lethal (SL) genetic interactions have been regarded as a promising focus for investigating potential targeted therapeutics to tackle cancer. However, the costly investment of time and labor associated with wet-lab experimental screenings to discover potential SL relationships motivates the development of computational methods. Although graph neural network (GNN) models have performed well in the prediction of SL gene pairs, existing GNN-based models are not designed for predicting cancer cell-specific SL interactions, which are more relevant to experimental validation in vitro. Moreover, existing methods have not fully utilized diverse graph representations of biological features to improve prediction performance. In this work, we propose MVGCN-iSL, a novel multi-view graph convolutional network (GCN) model to predict cancer cell-specific SL gene pairs by incorporating five biological graph features and multi-omics data. A max pooling operation is applied to integrate the five graph-specific representations obtained from the GCN models. Afterwards, a deep neural network (DNN) model serves as the prediction module to predict the SL interactions in individual cancer cells (iSL). Extensive experiments have validated the model's successful integration of the multiple graph features and its state-of-the-art performance in the prediction of potential SL gene pairs, as well as its generalization ability to novel genes.
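The idea of integrating several graph-specific representations by max pooling can be sketched in a few lines. This is only an illustration under assumptions: the function name is hypothetical, pooling is assumed to be element-wise across views, and the embeddings are toy numbers rather than outputs of the MVGCN-iSL model.

```python
def max_pool_views(view_embeddings):
    """Element-wise max across views.

    view_embeddings: list of views; each view is a list of per-node vectors,
    and all views share the same node order and embedding dimension.
    """
    n_nodes = len(view_embeddings[0])
    pooled = []
    for node in range(n_nodes):
        # Collect this node's vector from every view, then take the
        # maximum of each embedding dimension across the views.
        vectors = [view[node] for view in view_embeddings]
        pooled.append([max(values) for values in zip(*vectors)])
    return pooled


# Two toy views of two nodes with two-dimensional embeddings.
view_1 = [[1.0, 5.0], [0.0, 2.0]]
view_2 = [[3.0, 4.0], [-1.0, 7.0]]
print(max_pool_views([view_1, view_2]))
```

The pooled vector for each node keeps the same dimension as a single view, so a downstream prediction module sees a fixed-size input regardless of how many views are combined.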

6.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34347041

ABSTRACT

Drug combinations have exhibited promising therapeutic effects in treating cancer patients with less toxicity and fewer adverse side effects. However, it is infeasible to experimentally screen the enormous search space of all possible drug combinations. Therefore, developing computational models to efficiently and accurately identify potential anti-cancer synergistic drug combinations has attracted a lot of attention from the scientific community. Hypothesis-driven explicit mathematical methods or network pharmacology models have been popular in the last decade and have been comprehensively reviewed in previous surveys. With the surge of artificial intelligence and greater availability of large-scale datasets, machine learning methods, especially deep learning methods, are gaining popularity in the field of computational models for anti-cancer drug synergy prediction. Machine learning-based methods can be derived without strong assumptions about underlying mechanisms and have achieved state-of-the-art prediction performance, promoting much greater growth of the field. Here, we present a structured overview of available large-scale databases and of machine learning methods, especially deep learning methods, in computational predictive models for anti-cancer drug synergy prediction. We provide a unified framework for machine learning models and detail existing model architectures as well as their contributions and limitations, shedding light on the future design of computational models. In addition, unbiased experiments are conducted to provide in-depth comparisons between reviewed papers in terms of their prediction performance.


Subject(s)
Antineoplastic Agents/therapeutic use , Machine Learning , Antineoplastic Agents/administration & dosage , Datasets as Topic , Drug Therapy, Combination , Humans
7.
Front Genet ; 11: 807, 2020.
Article in English | MEDLINE | ID: mdl-33014009

ABSTRACT

Pseudogenes were historically regarded as relics of evolution, but they have recently shown increasing functional potential. Computational methods for predicting pseudogene functions in terms of Gene Ontology (GO) are important for directing experimental discovery. However, no pseudogene-specific computational methods have been proposed to directly predict their GO terms. The biggest challenge for pseudogene function prediction is the lack of sufficient features and functional annotations, which makes training a predictive model difficult. Considering the close functional similarity between pseudogenes and their parent coding genes, which share a great amount of DNA sequence, and the rich annotations of coding genes, we aim to predict pseudogene functions by borrowing information from coding genes in a graph-based way. Here we propose Pseudo2GO, a graph-based deep learning semi-supervised model for pseudogene function prediction. A sequence similarity graph is first constructed to connect pseudogenes and coding genes. Multiple features are incorporated into the model as node attributes to make the graph an attributed graph, including expression profiles, interactions with microRNAs, protein-protein interactions (PPIs), and genetic interactions. Graph convolutional networks are used to propagate node attributes across the graph to classify pseudogenes. Comparing Pseudo2GO with other frameworks adapted from popular protein function prediction methods, we demonstrated that our method achieved state-of-the-art performance, significantly outperforming other methods in terms of the M-AUPR metric.

8.
Biology (Basel) ; 9(9)2020 Sep 07.
Article in English | MEDLINE | ID: mdl-32906805

ABSTRACT

In the prediction of the synergy of drug combinations, systems pharmacology models expand the scope of experimental screening and overcome the limitations of current computational models posed by their lack of mechanistic interpretation and integration of gene essentiality. We therefore investigated the synergy of drug combinations for cancer therapies utilizing records in NCI ALMANAC, and we employed logistic regression to test the statistical significance of gene and pathway features in that interaction. We trained our predictive models using 43 NCI-60 cell lines, 165 KEGG pathways, and 114 drug pairs. Scores of drug-combination synergies showed a stronger correlation with pathway than gene features in overall trend analysis and a significant association with both genes and pathways in genome-wide association analyses. However, we observed little overlap of significant gene expressions and essentialities and no significant evidence associating target and non-target genes and their pathways. We were able to validate four drug-combination pathways between two drug combinations, Nelarabine-Exemestane and Docetaxel-Vemurafenib, and two signaling pathways, PI3K-AKT and AMPK, in 16 cell lines. In conclusion, pathways significantly outperformed genes in predicting drug-combination synergy, and because they have very different mechanisms, gene expression and essentiality should be considered in combination rather than individually to improve this prediction.

9.
Gigascience ; 9(8)2020 08 01.
Article in English | MEDLINE | ID: mdl-32770210

ABSTRACT

BACKGROUND: Identifying protein functions is important for many biological applications. Since experimental functional characterization of proteins is time-consuming and costly, accurate and efficient computational methods for predicting protein functions are in great demand for generating testable hypotheses to guide large-scale experiments. RESULTS: Here, we propose Graph2GO, a multi-modal graph-based representation learning model that can integrate heterogeneous information, including multiple types of interaction networks (sequence similarity network and protein-protein interaction network) and protein features (amino acid sequence, subcellular location, and protein domains) to predict protein functions on gene ontology. Comparing Graph2GO to BLAST, as a baseline model, and to two popular protein function prediction methods (Mashup and deepNF), we demonstrated that our model can achieve state-of-the-art performance. We show the robustness of our model by testing on multiple species. We also provide a web server supporting function query and downstream analysis on-the-fly. CONCLUSIONS: Graph2GO is the first model that has utilized attributed network representation learning methods to model both interaction networks and protein features for predicting protein functions, and it achieved promising performance. Our model can be easily extended to include more protein features to further improve the performance. In addition, Graph2GO is applicable to other application scenarios involving biological networks, and the learned latent representations can be used as feature inputs for machine learning tasks in various downstream analyses.


Subject(s)
Protein Interaction Maps , Proteins , Amino Acid Sequence , Gene Ontology , Machine Learning , Proteins/genetics , Proteins/metabolism
10.
BMC Bioinformatics ; 21(1): 323, 2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32693790

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) are central to many biological processes. Considering that the experimental methods for identifying PPIs are time-consuming and expensive, it is important to develop automated computational methods to better predict PPIs. Various machine learning methods have been proposed, including a sequence-based deep learning technique that has achieved promising results. However, it only focuses on sequence information while ignoring the structural information of PPI networks. Structural information of PPI networks, such as the degree, position, and neighboring nodes of proteins in a graph, has been shown to be informative in PPI prediction. RESULTS: Facing the challenge of representing graph information, we introduce an improved graph representation learning method. Our model can predict PPIs based on both sequence information and graph structure. Moreover, our study takes advantage of a representation learning model and employs a graph-based deep learning method for PPI prediction, which shows superiority over existing sequence-based methods. Our method achieves state-of-the-art accuracy of 99.15% on the Human Protein Reference Database (HPRD) dataset and also obtains the best results on the Database of Interacting Proteins (DIP) Human, Drosophila, Escherichia coli (E. coli), and Caenorhabditis elegans (C. elegans) datasets. CONCLUSION: Here, we introduce the signed variational graph auto-encoder (S-VGAE), an improved graph representation learning method, to automatically learn to encode graph structure into low-dimensional embeddings. Experimental results demonstrate that our method outperforms other existing sequence-based methods on several datasets. We also demonstrate the robustness of our model for very sparse networks and its generalization to a new dataset that consists of four datasets: HPRD, E. coli, C. elegans, and Drosophila.


Subject(s)
Protein Interaction Mapping/methods , Animals , Caenorhabditis elegans/metabolism , Computer Simulation , Databases, Protein , Drosophila/metabolism , Escherichia coli/metabolism , Humans , Machine Learning , Neural Networks, Computer
11.
J Theor Biol ; 437: 202-213, 2018 01 21.
Article in English | MEDLINE | ID: mdl-29111420

ABSTRACT

Owing to its viscoelastic nature, tendon exhibits stress rate-dependent breaking and stiffness function. A Kelvin-Voigt viscoelastic shear lag model is proposed to illustrate the micromechanical behavior of the tendon under dynamic tensile conditions. Theoretical closed-form expressions are derived to predict the deformation and stress transfer between fibrils and interfibrillar matrix while the tendon is dynamically stretched. The results from the analytical solutions demonstrate how the fibril overlap length and fibril volume fraction affect the stress transfer and mechanical properties of tendon. We find that the viscoelastic property of the interfibrillar matrix mainly results in collagen fibril failure under a fast loading rate or creep rupture of tendon. However, the discontinuous fibril model and hierarchical structure of tendon ensure relative sliding under a slow loading rate, helping dissipate energy and protecting fibrils from damage, which may be a key reason why a regularly staggered alignment microstructure is widely selected in nature. The conclusions presented in this paper agree well with the experimental findings on the growth, injury, healing and healed processes of tendon observed by many researchers. Additionally, although the emphasis of this paper is on the micromechanical behavior of tendon, this analytical viscoelastic shear lag model is equally applicable to other soft or hard tissues that have a similar microstructure.
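For orientation, the standard one-dimensional Kelvin-Voigt element (a spring in parallel with a dashpot) is written out here. This is the textbook constitutive relation and its creep response, not the paper's shear lag expressions, and the symbols E (elastic modulus), \eta (viscosity) and \sigma_0 (applied constant stress) are notation assumed for this note.

```latex
% Kelvin-Voigt element: spring (modulus E) in parallel with dashpot (viscosity \eta)
\sigma(t) = E\,\varepsilon(t) + \eta\,\frac{d\varepsilon(t)}{dt}

% Creep under a constant stress \sigma_0 applied at t = 0, with \varepsilon(0) = 0
\varepsilon(t) = \frac{\sigma_0}{E}\left(1 - e^{-E t/\eta}\right)
```

The rate term \eta\,d\varepsilon/dt is what makes the response loading rate-dependent: at fast strain rates the dashpot carries a large share of the stress, while at slow rates the response approaches that of the spring alone.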


Subject(s)
Algorithms , Biomechanical Phenomena/physiology , Models, Biological , Tendons/physiology , Animals , Elasticity , Humans , Stress, Mechanical , Tensile Strength/physiology , Viscosity