Results 1 - 13 of 13
1.
Nucleic Acids Res ; 51(D1): D1242-D1248, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36259664

ABSTRACT

Extensive in vitro cancer drug screening datasets have enabled scientists to identify biomarkers and develop machine learning models for predicting drug sensitivity. While most advancements have focused on omics profiles, cancer drug sensitivity scores precalculated by the original sources are often used as-is, without consideration for variability between studies. It is well known that significant inconsistencies exist between the drug sensitivity scores across datasets due to differences in experimental setups and preprocessing methods used to obtain the sensitivity scores. As a result, many studies opt to focus only on a single dataset, leading to underutilization of available data and a limited interpretation of cancer pharmacogenomics analysis. To overcome these caveats, we have developed CREAMMIST (https://creammist.mtms.dev), an integrative database that enables users to obtain an integrative dose-response curve that captures uncertainty (or high certainty when multiple datasets align well) across five widely used cancer cell-line drug-response datasets. We used a Bayesian framework to systematically integrate all available dose-response values across datasets (>14 million dose-response data points). CREAMMIST provides easy-to-use statistics derived from the integrative dose-response curves for various downstream analyses such as identifying biomarkers, selecting drug concentrations for experiments, and training robust machine learning models.
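
As a rough illustration of pooling dose-response points from several datasets into a single curve with an uncertainty estimate, the sketch below computes a grid posterior over the IC50 of a logistic curve. The curve form, the fixed slope, the flat prior, the Gaussian noise level, the data points, and all function names are assumptions made for illustration; this is not CREAMMIST's actual model.

```python
import numpy as np

def logistic_response(log_conc, log_ic50, slope=1.0):
    """Viability at a log10 concentration under an assumed logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (slope * (log_conc - log_ic50)))

def ic50_grid_posterior(log_conc, viability, grid, sigma=0.1):
    """Posterior over log10(IC50) on a grid (flat prior, Gaussian noise; both assumed)."""
    pred = logistic_response(log_conc[None, :], grid[:, None])  # (n_grid, n_points)
    log_lik = -0.5 * np.sum((viability[None, :] - pred) ** 2, axis=1) / sigma ** 2
    post = np.exp(log_lik - log_lik.max())  # subtract the max for numerical stability
    return post / post.sum()

# Hypothetical noise-free points pooled from two datasets for one drug/cell-line pair.
log_conc = np.array([-2.0, -1.0, 0.0, 1.0, -1.5, -0.5, 0.5, 1.5])
viability = logistic_response(log_conc, log_ic50=0.0)

grid = np.linspace(-3, 3, 121)
post = ic50_grid_posterior(log_conc, viability, grid)
post_mean = float(np.sum(grid * post))
post_sd = float(np.sqrt(np.sum((grid - post_mean) ** 2 * post)))
```

Here `post_sd` plays the role of the uncertainty: it shrinks when points from different datasets agree and widens when they conflict.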


Subject(s)
Antineoplastic Agents , Databases, Factual , Neoplasms , Humans , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Bayes Theorem , Biomarkers , Machine Learning , Neoplasms/drug therapy , Neoplasms/genetics
2.
Nat Commun ; 15(1): 3744, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702321

ABSTRACT

Cellular composition and anatomical organization influence normal and aberrant organ functions. Emerging spatial single-cell proteomic assays such as Image Mass Cytometry (IMC) and Co-Detection by Indexing (CODEX) have facilitated the study of cellular composition and organization by enabling high-throughput measurement of cells and their localization directly in intact tissues. However, annotation of cell types and quantification of their relative localization in tissues remain challenging. To address these unmet needs for atlas-scale datasets like the Human Pancreas Analysis Program (HPAP), we develop AnnoSpat (Annotator and Spatial Pattern Finder), which uses neural network and point process algorithms to automatically identify cell types and quantify cell-cell proximity relationships. Our study of data from IMC and CODEX shows the higher performance of AnnoSpat in rapid and accurate annotation of cell types compared to alternative approaches. Moreover, the application of AnnoSpat to type 1 diabetic, non-diabetic autoantibody-positive, and non-diabetic organ donor cohorts recapitulates known islet pathobiology and shows differential dynamics of pancreatic polypeptide (PP) cell abundance and CD8+ T cell infiltration in islets during type 1 diabetes progression.
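
One common point-process summary of cell-cell proximity is Ripley's cross-K function, which counts how many cells of one type lie within a radius r of each cell of another type, normalized by density. The sketch below is a naive version without edge correction, run on hypothetical coordinates; it is offered only as an illustration of the general idea and is not AnnoSpat's implementation. The function name and data are assumptions.

```python
import numpy as np

def cross_k(points_a, points_b, radius, area):
    """Naive Ripley's cross-K estimate (no edge correction)."""
    diff = points_a[:, None, :] - points_b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))          # pairwise A-to-B distances
    pairs_within = np.count_nonzero(dist <= radius)
    return area * pairs_within / (len(points_a) * len(points_b))

rng = np.random.default_rng(0)
# Hypothetical cells in a unit square: one B population sits next to the A cells,
# the other is scattered uniformly.
cells_a = rng.uniform(0.2, 0.8, size=(50, 2))
cells_b_near = cells_a + rng.normal(scale=0.01, size=cells_a.shape)
cells_b_random = rng.uniform(0.0, 1.0, size=(50, 2))

r = 0.05
k_near = cross_k(cells_a, cells_b_near, r, area=1.0)
k_random = cross_k(cells_a, cells_b_random, r, area=1.0)
k_csr = np.pi * r ** 2  # theoretical value under complete spatial randomness
```

A cross-K value well above `k_csr` suggests that the two cell types sit closer together than chance placement would give.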


Subject(s)
Algorithms , Diabetes Mellitus, Type 1 , Pancreas , Proteomics , Humans , Proteomics/methods , Diabetes Mellitus, Type 1/pathology , Diabetes Mellitus, Type 1/metabolism , Pancreas/cytology , Pancreas/metabolism , Islets of Langerhans/metabolism , Islets of Langerhans/cytology , Single-Cell Analysis/methods , Neural Networks, Computer , CD8-Positive T-Lymphocytes/metabolism , Image Cytometry/methods
3.
bioRxiv ; 2023 Jan 18.
Article in English | MEDLINE | ID: mdl-36712052

ABSTRACT

Cellular composition and anatomical organization influence normal and aberrant organ functions. Emerging spatial single-cell proteomic assays such as Image Mass Cytometry (IMC) and Co-Detection by Indexing (CODEX) have facilitated the study of cellular composition and organization by enabling high-throughput measurement of cells and their localization directly in intact tissues. However, annotation of cell types and quantification of their relative localization in tissues remain challenging. To address these unmet needs, we developed AnnoSpat (Annotator and Spatial Pattern Finder), which uses neural network and point process algorithms to automatically identify cell types and quantify cell-cell proximity relationships. Our study of data from IMC and CODEX shows the superior performance of AnnoSpat in rapid and accurate annotation of cell types compared to alternative approaches. Moreover, the application of AnnoSpat to type 1 diabetic, non-diabetic autoantibody-positive, and non-diabetic organ donor cohorts recapitulated known islet pathobiology and showed differential dynamics of pancreatic polypeptide (PP) cell abundance and CD8+ T cell infiltration in islets during type 1 diabetes progression.

4.
J Comput Biol ; 29(5): 441-452, 2022 05.
Article in English | MEDLINE | ID: mdl-35394368

ABSTRACT

This study formulates antiviral repositioning as a matrix completion problem wherein the antiviral drugs are along the rows and the viruses are along the columns. The input matrix is partially filled, with ones in positions where the antiviral drug is known to be effective against a virus. The curated metadata for antivirals (chemical structure and pathways) and viruses (genomic structure and symptoms) are encoded into our matrix completion framework as graph Laplacian regularization. We then frame the resulting multiple graph regularized matrix completion (GRMC) problem as deep matrix factorization. This is solved using a novel optimization method called HyPALM (Hybrid Proximal Alternating Linearized Minimization). Results on our curated RNA drug-virus association data set show that the proposed approach outperforms state-of-the-art GRMC techniques. When applied to in silico prediction of antivirals for COVID-19, our approach returns antivirals that are either used for treating patients or are under trials for the same.
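
To make the graph Laplacian regularizer concrete, the sketch below builds a Laplacian L = D - W from a drug similarity matrix and runs plain gradient descent on a shallow factorization with the penalty tr(U^T L U). It deliberately swaps in ordinary gradient descent and a single-layer factorization in place of HyPALM and deep matrix factorization, and it uses only a row (drug) graph. The toy data, function names, and hyperparameters are all hypothetical.

```python
import numpy as np

def graph_laplacian(similarity):
    """Unnormalized Laplacian L = D - W of a symmetric similarity matrix."""
    return np.diag(similarity.sum(axis=1)) - similarity

def graph_regularized_mf(Y, mask, L_rows, rank=2, lam=0.1, mu=0.1,
                         lr=0.01, steps=2000, seed=0):
    """Gradient descent on
    ||mask * (Y - U V^T)||_F^2 + lam * (||U||_F^2 + ||V||_F^2) + mu * tr(U^T L U)."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((Y.shape[0], rank))
    V = 0.1 * rng.standard_normal((Y.shape[1], rank))
    for _ in range(steps):
        R = mask * (U @ V.T - Y)                       # residual on observed entries
        grad_U = 2 * R @ V + 2 * lam * U + 2 * mu * L_rows @ U
        grad_V = 2 * R.T @ U + 2 * lam * V
        U -= lr * grad_U
        V -= lr * grad_V
    return U @ V.T

# Toy example: 4 drugs x 3 viruses; drugs 0/1 and drugs 2/3 are similar.
Y = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]], dtype=float)
mask = np.ones_like(Y)
mask[1, 1] = 0.0                                       # hide one known association
W = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
scores = graph_regularized_mf(Y, mask, graph_laplacian(W))
```

The hidden entry `scores[1, 1]` should come out high because drug 1 shares both its observed profile and a graph edge with drug 0.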


Subject(s)
COVID-19 Drug Treatment , Algorithms , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , Humans
5.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3332-3339, 2022.
Article in English | MEDLINE | ID: mdl-35816539

ABSTRACT

Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug re-positioning can be assisted by various kinds of computational methods that predict the best indication for a drug given the open-source biological datasets. Because similar drugs tend to have common pathways and disease indications, the association matrix is assumed to have a low-rank structure. Hence, the problem of drug-disease association prediction can be modeled as a low-rank matrix completion problem. In this work, we propose a novel matrix completion framework, graph regularized 1-bit matrix completion (GR1BMC), that makes use of the side-information associated with drugs/diseases, modeled as a neighborhood graph, for the prediction of drug-disease indications. The algorithm is specially designed for binary data and uses a parallel proximal algorithm to solve the aforesaid minimization problem, taking into account all the constraints, including the neighborhood graph incorporation and the restriction of predicted scores to the specified range. The results have been validated on two standard databases by evaluating the AUC across the 10-fold cross-validation splits. The usage of the method is also evaluated through a case study where the top 5 indications are predicted for novel drugs and then verified against the CTD database.


Subject(s)
Algorithms , Computational Biology , Computational Biology/methods , Research Design , Databases, Factual , Data Management
6.
Nat Metab ; 4(2): 284-299, 2022 02.
Article in English | MEDLINE | ID: mdl-35228745

ABSTRACT

Type 1 diabetes (T1D) is an autoimmune disease in which immune cells destroy insulin-producing beta cells. The aetiology of this complex disease is dependent on the interplay of multiple heterogeneous cell types in the pancreatic environment. Here, we provide a single-cell atlas of pancreatic islets of 24 T1D, autoantibody-positive and nondiabetic organ donors across multiple quantitative modalities including ~80,000 cells using single-cell transcriptomics, ~7,000,000 cells using cytometry by time of flight and ~1,000,000 cells using in situ imaging mass cytometry. We develop an advanced integrative analytical strategy to assess pancreatic islets and identify canonical cell types. We show that a subset of exocrine ductal cells acquires a signature of tolerogenic dendritic cells in an apparent attempt at immune suppression in T1D donors. Our multimodal analyses delineate cell types and processes that may contribute to T1D immunopathogenesis and provide an integrative procedure for exploration and discovery of human pancreatic function.


Subject(s)
Diabetes Mellitus, Type 1 , Insulin-Secreting Cells , Islets of Langerhans , Humans , Insulin-Secreting Cells/metabolism , Islets of Langerhans/metabolism , Pancreas/metabolism , Pancreatic Hormones/metabolism
7.
Sci Rep ; 11(1): 9047, 2021 04 27.
Article in English | MEDLINE | ID: mdl-33907209

ABSTRACT

The year 2020 witnessed a heavy death toll due to COVID-19, calling for a global emergency. The continuous ongoing research and clinical trials paved the way for vaccines. However, vaccine efficacy in the long run is still questionable due to the mutating coronavirus, which makes drug re-positioning a reasonable alternative. COVID-19 has hence accelerated drug re-positioning for the treatment of the disease and its symptoms. This work builds computational models using matrix completion techniques to predict drug-virus association for drug re-positioning. The aim is to assist clinicians with a tool for selecting prospective antiviral treatments. Since the virus is known to mutate fast, the tool is likely to help clinicians in selecting the right set of antivirals for the mutated isolate. The main contribution of this work is a manually curated, publicly shared database comprising existing associations between viruses and their corresponding antivirals. The database gathers similarity information using the chemical structure of drugs and the genomic structure of viruses. Along with this database, we make available a set of state-of-the-art computational drug re-positioning tools based on matrix completion. The tools are first analysed on a standard set of experimental protocols for drug target interactions. The best performing ones are applied to the task of re-positioning antivirals for COVID-19. These tools select six drugs, out of which four are currently under various stages of trial, namely Remdesivir (as a cure), Ribavirin (in combination with others for cure), Umifenovir (as a prophylactic and cure) and Sofosbuvir (as a cure). Another unanimous prediction is Tenofovir alafenamide, which is a novel Tenofovir prodrug developed in order to improve renal safety when compared to its original counterpart (older version) Tenofovir disoproxil. Both are under trial, the former as a cure and the latter as a prophylactic. These results establish that the computational methods are in sync with the state-of-practice. We also demonstrate how the drugs to be used against the virus would vary as SARS-CoV-2 mutates over time by predicting the drugs for the mutated strains, suggesting the importance of such a tool in drug prediction. We believe this work would open up possibilities for applying machine learning models to clinical research for drug-virus association prediction and other similar biological problems.


Subject(s)
Antiviral Agents/therapeutic use , COVID-19 Drug Treatment , Algorithms , Area Under Curve , COVID-19/virology , Databases, Factual , Drug Repositioning , Evolution, Molecular , Humans , Mutation , ROC Curve , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
8.
Genome Med ; 13(1): 189, 2021 12 16.
Article in English | MEDLINE | ID: mdl-34915921

ABSTRACT

While understanding molecular heterogeneity across patients underpins precision oncology, there is increasing appreciation for taking intra-tumor heterogeneity into account. Based on large-scale analysis of cancer omics datasets, we highlight the importance of intra-tumor transcriptomic heterogeneity (ITTH) for predicting clinical outcomes. Leveraging single-cell RNA-seq (scRNA-seq) with a recommender system (CaDRReS-Sc), we show that heterogeneous gene-expression signatures can predict drug response with high accuracy (80%). Using patient-proximal cell lines, we established the validity of CaDRReS-Sc's monotherapy (Pearson r>0.6) and combinatorial predictions targeting clone-specific vulnerabilities (>10% improvement). Applying CaDRReS-Sc to rapidly expanding scRNA-seq compendiums can serve as an in silico screen to accelerate drug-repurposing studies. Availability: https://github.com/CSB5/CaDRReS-Sc .


Subject(s)
Neoplasms , Transcriptome , Clone Cells , Gene Expression Profiling , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Precision Medicine , Sequence Analysis, RNA , Single-Cell Analysis , Software
9.
PLoS One ; 15(1): e0226484, 2020.
Article in English | MEDLINE | ID: mdl-31945078

ABSTRACT

The identification of potential interactions between drugs and target proteins is crucial in pharmaceutical sciences. The experimental validation of interactions in genomic drug discovery is laborious and expensive; hence, there is a need for efficient and accurate in silico techniques that can predict potential drug-target interactions to narrow down the search space for experimental verification. In this work, we propose a new framework, namely Multi-Graph Regularized Nuclear Norm Minimization, which predicts the interactions between drugs and target proteins from three inputs: the known drug-target interaction network, similarities over drugs and similarities over targets. The proposed method focuses on finding a low-rank interaction matrix that is structured by the proximities of drugs and targets encoded by graphs. Previous works on Drug Target Interaction (DTI) prediction have shown that incorporating drug and target similarities helps in learning the data manifold better by preserving the local geometries of the original data. However, there is no clear consensus on which kind and what combination of similarities would best assist the prediction task. Hence, we propose to use multiple drug-drug similarities and target-target similarities as multiple graph Laplacian (over drugs/targets) regularization terms to capture the proximities exhaustively. Extensive cross-validation experiments on four benchmark datasets using standard evaluation metrics (AUPR and AUC) show that the proposed algorithm improves the predictive performance and outperforms recent state-of-the-art computational methods by a large margin. Software is publicly available at https://github.com/aanchalMongia/MGRNNMforDTI.
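
Nuclear norm minimization is commonly solved with singular value soft-thresholding. The sketch below is a generic soft-impute style loop that omits the multi-graph Laplacian terms entirely, so it illustrates only the low-rank part of the idea and should not be read as the MGRNNM algorithm. The function names, the toy matrix, and the parameter values are assumptions.

```python
import numpy as np

def svd_soft_threshold(X, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft_impute(Y, mask, tau=0.1, iters=1000):
    """Alternate between filling unobserved entries with the current estimate
    and shrinking the singular values of the filled matrix."""
    X = np.zeros_like(Y, dtype=float)
    for _ in range(iters):
        filled = mask * Y + (1 - mask) * X
        X = svd_soft_threshold(filled, tau)
    return X

# Toy rank-1 matrix with one hidden entry (true value 6 at row 2, column 1).
Y = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
mask = np.ones_like(Y)
mask[2, 1] = 0.0
X_hat = soft_impute(Y, mask)
```

The recovered `X_hat[2, 1]` lands slightly below the true value because the shrinkage step biases all singular values downward by `tau`.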


Subject(s)
Algorithms , Computer Graphics , Drug Development/methods , Drug Discovery/methods , Drug Interactions , Pharmaceutical Preparations/metabolism , Proteins/metabolism , Computer Simulation , Humans , Pharmaceutical Preparations/chemistry , Proteins/chemistry
10.
J Comput Biol ; 27(7): 1011-1019, 2020 07.
Article in English | MEDLINE | ID: mdl-31657645

ABSTRACT

Single-cell RNA-seq has inspired new discoveries and innovation in the field of developmental and cell biology for the past few years and is useful for studying cellular responses at individual cell resolution. However, due to the paucity of starting RNA, the data acquired have dropouts. To address this, we propose a deep matrix factorization-based method, deepMc, to impute missing values in gene expression data. For the deep architecture of our approach, we draw our motivation from the great success of deep learning in solving various machine learning problems. In this study, we support our method with positive results on several evaluation metrics such as clustering of cell populations, differential expression analysis, and cell type separability.


Subject(s)
Computational Biology/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Animals , Blastocyst/cytology , Deep Learning , HEK293 Cells , Humans , Jurkat Cells , Mice , Sequence Analysis, RNA/statistics & numerical data , Single-Cell Analysis/statistics & numerical data
11.
NAR Genom Bioinform ; 2(4): lqaa091, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33575635

ABSTRACT

The advent of single-cell open-chromatin profiling technology has facilitated the analysis of heterogeneity in the activity of regulatory regions at single-cell resolution. However, stochasticity and the low amount of available DNA cause a high drop-out rate and noise in single-cell open-chromatin profiles. We introduce here a robust method called forest of imputation trees (FITs) to recover original signals from highly sparse and noisy single-cell open-chromatin profiles. FITs makes multiple imputation trees to avoid bias during the restoration of read-count matrices. It resolves the challenging issue of recovering open chromatin signals without blurring out information at genomic sites with cell-type-specific activity. Besides visualization and classification, FITs-based imputation also improved accuracy in the detection of enhancers, the calculation of pathway enrichment scores and the prediction of chromatin interactions. FITs is generalized for wider applicability, especially for highly sparse read-count matrices. The superiority of FITs in recovering signals of minority cells also makes it highly useful for single-cell open-chromatin profiles from in vivo samples. The software is freely available at https://reggenlab.github.io/FITs/.

12.
Front Genet ; 10: 9, 2019.
Article in English | MEDLINE | ID: mdl-30761179

ABSTRACT

Motivation: Single-cell RNA sequencing has proved revolutionary for its potential to zoom into complex biological systems. Genome-wide expression analysis at single-cell resolution provides a window into the dynamics of cellular phenotypes. This facilitates the characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, typical single-cell RNA sequencing data feature a high number of dropout events, where transcripts fail to get amplified. Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single-cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in the separation of true zeros from dropouts, cell clustering, differential expression analysis, cell type separability, the performance of dimensionality reduction techniques for cell visualization, and gene distribution. Availability and Implementation: https://github.com/aanchalMongia/McImpute_scRNAseq.

13.
Sci Rep ; 8(1): 16329, 2018 11 05.
Article in English | MEDLINE | ID: mdl-30397240

ABSTRACT

The emergence of single-cell RNA sequencing (scRNA-seq) technologies has enabled us to measure the expression levels of thousands of genes at single-cell resolution. However, insufficient quantities of starting RNA in the individual cells cause significant dropout events, introducing a large number of zero counts in the expression matrix. To circumvent this, we developed an autoencoder-based sparse gene expression matrix imputation method, AutoImpute, which learns the inherent distribution of the input scRNA-seq data and imputes the missing values accordingly with minimal modification to the biologically silent genes. When tested on real scRNA-seq datasets, AutoImpute performed competitively with respect to the existing single-cell imputation methods in terms of expression recovery from subsampled data, cell-clustering accuracy, variance stabilization and cell-type separability.


Subject(s)
Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Analysis of Variance , Automation , Cluster Analysis , Gene Expression Profiling