Search | Secretaria de Estado da Saúde

1.

Marker-free characterization of full-length transcriptomes of single live circulating tumor cells.

Poonia, Sarita; Goel, Anurag; Chawla, Smriti; Bhattacharya, Namrata; Rai, Priyadarshini; Lee, Yi Fang; Yap, Yoon Sim; West, Jay; Bhagat, Ali Asgar; Tayal, Juhi; Mehta, Anurag; Ahuja, Gaurav; Majumdar, Angshul; Ramalingam, Naveen; Sengupta, Debarka.

Genome Res ; 33(1): 80-95, 2023 01.

Article in English | MEDLINE | ID: mdl-36414416

ABSTRACT

The identification and characterization of circulating tumor cells (CTCs) are important for gaining insights into the biology of metastatic cancers, monitoring disease progression, and medical management of the disease. The limiting factor in the enrichment of purified CTC populations is their sparse availability, heterogeneity, and altered phenotypes relative to the primary tumor. Intensive research both at the technical and molecular fronts led to the development of assays that ease CTC detection and identification from peripheral blood. Most CTC detection methods based on single-cell RNA sequencing (scRNA-seq) use a mix of size selection, marker-based white blood cell (WBC) depletion, and antibodies targeting tumor-associated antigens. However, the majority of these methods either miss out on atypical CTCs or suffer from WBC contamination. We present unCTC, an R package for unbiased identification and characterization of CTCs from single-cell transcriptomic data. unCTC features many standard and novel computational and statistical modules for various analyses. These include a novel method of scRNA-seq clustering, named deep dictionary learning using k-means clustering cost (DDLK), expression-based copy number variation (CNV) inference, and combinatorial, marker-based verification of the malignant phenotypes. DDLK enables robust segregation of CTCs and WBCs in the pathway space, as opposed to the gene expression space. We validated the utility of unCTC on scRNA-seq profiles of breast CTCs from six patients, captured and profiled using an integrated ClearCell FX and Polaris workflow that works by the principles of size-based separation of CTCs and marker-based WBC depletion.

Subjects

Neoplastic Cells, Circulating; Humans; Neoplastic Cells, Circulating/metabolism; Transcriptome; DNA Copy Number Variations; Gene Expression Profiling; Biomarkers, Tumor

2.

Literature mining discerns latent disease-gene relationships.

Rai, Priyadarshini; Jain, Atishay; Kumar, Shivani; Sharma, Divya; Jha, Neha; Chawla, Smriti; Raj, Abhijit; Gupta, Apoorva; Poonia, Sarita; Majumdar, Angshul; Chakraborty, Tanmoy; Ahuja, Gaurav; Sengupta, Debarka.

Bioinformatics ; 40(4)2024 Mar 29.

Article in English | MEDLINE | ID: mdl-38608194

ABSTRACT

MOTIVATION: Dysregulation of a gene's function, either due to mutations or impairments in regulatory networks, often triggers pathological states in the affected tissue. Comprehensive mapping of these apparent gene-pathology relationships is an ever-daunting task, primarily due to genetic pleiotropy and lack of suitable computational approaches. With the advent of high throughput genomics platforms and community scale initiatives such as the Human Cell Landscape project, researchers have been able to create gene expression portraits of healthy tissues resolved at the level of single cells. However, a similar wealth of knowledge is currently not at our fingertips when it comes to diseases. This is because the genetic manifestation of a disease is often quite diverse and is confounded by several clinical and demographic covariates. RESULTS: To circumvent this, we mined ~18 million PubMed abstracts published up to May 2019 and automatically selected ~4.5 million of them that describe roles of particular genes in disease pathogenesis. Further, we fine-tuned the pretrained bidirectional encoder representations from transformers (BERT) for language modeling from the domain of natural language processing to learn vector representation of entities such as genes, diseases, tissues, cell-types, etc., in a way such that their relationship is preserved in a vector space. The repurposed BERT predicted disease-gene associations that are not cited in the training data, thereby highlighting the feasibility of in silico synthesis of hypotheses linking different biological entities such as genes and conditions. AVAILABILITY AND IMPLEMENTATION: PathoBERT pretrained model: https://github.com/Priyadarshini-Rai/Pathomap-Model. BioSentVec-based abstract classification model: https://github.com/Priyadarshini-Rai/Pathomap-Model. Pathomap R package: https://github.com/Priyadarshini-Rai/Pathomap.

Subjects

Data Mining; Humans; Data Mining/methods; Computational Biology/methods; Natural Language Processing

3.

Compressed sensing CPMG with group-sparse reconstruction for myelin water imaging.

Chen, Henry S; Majumdar, Angshul; Kozlowski, Piotr.

Magn Reson Med ; 71(3): 1166-71, 2014 Mar.

Article in English | MEDLINE | ID: mdl-23776079

ABSTRACT

PURPOSE: Myelin content is a marker for nervous system pathology and is quantifiable by myelin water imaging using multi-echo CPMG sequence, which is inherently slow. One way to accelerate the scan is to utilize compressed sensing. However, reconstructing the images piecemeal by standard compressed sensing methods is not the optimal solution, because it only exploits intraimage spatial redundancy. It does not recognize that the different T2 weighted images are scans of the same anatomical volume and hence correlated. The purpose of this work is to test the feasibility of compressed sensed CPMG with group-sparsity promoting optimization for myelin water imaging. METHODS: Group-sparse reconstruction was performed at various simulated and actual undersampling factors for an electronic phantom, ex vivo rat spinal cord, and in vivo rat spinal cord. Normalized mean square error was used as the metric for comparison. RESULTS: For both simulated undersampling and the actual undersampling, the method was found to minimally impact myelin water fraction map quality (normalized mean square error < 0.25) when acceleration factor was below two. CONCLUSION: Compressed sensed CPMG with group-sparse reconstruction is useful for achieving a shorter scan time than traditionally possible.

Subjects

Body Water/chemistry; Data Compression/methods; Magnetic Resonance Spectroscopy/methods; Molecular Imaging/methods; Myelin Sheath/chemistry; Spinal Cord/chemistry; Algorithms; Humans; Image Interpretation, Computer-Assisted/methods; Reproducibility of Results; Sensitivity and Specificity
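
The group-sparsity idea in the abstract above (coefficients at the same position across all echoes tend to be zero or non-zero together) can be illustrated with a row-wise shrinkage step. This is a minimal sketch, not the authors' reconstruction code: the function name `prox_l21`, the threshold `tau`, and the toy coefficient matrix are assumptions made for illustration.

```python
import numpy as np

def prox_l21(X, tau):
    """Row-wise soft thresholding: proximal operator of tau * sum_i ||X[i, :]||_2."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Rows with norm <= tau are zeroed; the remaining rows are shrunk toward zero.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return X * scale

# Rows = transform coefficients, columns = echoes (hypothetical toy values).
coeffs = np.array([[3.0, 4.0], [0.3, 0.4]])
shrunk = prox_l21(coeffs, tau=1.0)
```

A whole row is kept or discarded together, which is how correlation across echoes is exploited; solvers for group-sparse recovery typically apply a step of this kind inside an iterative loop.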

4.

A low-rank matrix recovery approach for energy efficient EEG acquisition for a wireless body area network.

Majumdar, Angshul; Gogna, Anupriya; Ward, Rabab.

Sensors (Basel) ; 14(9): 15729-48, 2014 Aug 25.

Article in English | MEDLINE | ID: mdl-25157551

ABSTRACT

We address the problem of acquiring and transmitting EEG signals in Wireless Body Area Networks (WBAN) in an energy efficient fashion. In WBANs, the energy is consumed by three operations: sensing (sampling), processing and transmission. Previous studies only addressed the problem of reducing the transmission energy. For the first time, in this work, we propose a technique to reduce sensing and processing energy as well: this is achieved by randomly under-sampling the EEG signal. We depart from previous Compressed Sensing based approaches and formulate signal recovery (from under-sampled measurements) as a matrix completion problem. A new algorithm to solve the matrix completion problem is derived here. We test our proposed method and find that the reconstruction accuracy of our method is significantly better than state-of-the-art techniques; and we achieve this while saving sensing, processing and transmission energy. Simple power analysis shows that our proposed methodology consumes considerably less power compared to previous CS based techniques.

Subjects

Algorithms; Computer Communication Networks/instrumentation; Data Compression/methods; Electric Power Supplies; Electroencephalography/instrumentation; Monitoring, Ambulatory/instrumentation; Wireless Technology/instrumentation; Electroencephalography/methods; Energy Transfer; Equipment Design; Equipment Failure Analysis
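
Recovering randomly under-sampled signals as a matrix completion problem can be sketched with a generic textbook scheme. The abstract states that the paper derives its own algorithm, which is not reproduced here; the code below instead alternates a rank-r SVD projection with re-insertion of the measured samples. The function name, the rank, the sampling ratio, and the synthetic data are all assumptions.

```python
import numpy as np

def complete_low_rank(M_obs, mask, rank, n_iter=300):
    """Fill unobserved entries by alternating a rank-r SVD projection with data consistency."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep the measured samples and use the low-rank estimate elsewhere.
        X = np.where(mask, M_obs, low_rank)
    return low_rank

rng = np.random.default_rng(0)
# Synthetic rank-2 "channels x time" block standing in for multi-channel EEG.
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(truth.shape) < 0.6          # roughly 60% of samples acquired
estimate = complete_low_rank(truth * mask, mask, rank=2)
rel_err = np.linalg.norm(estimate - truth) / np.linalg.norm(truth)
```

Only the sampled entries need to be sensed and transmitted; the computational burden of filling in the rest is shifted to the receiver, which matches the energy argument made in the abstract.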

5.

Rank awareness in group-sparse recovery of multi-echo MR images.

Majumdar, Angshul; Ward, Rabab.

Sensors (Basel) ; 13(3): 3902-21, 2013 Mar 20.

Article in English | MEDLINE | ID: mdl-23519348

ABSTRACT

This work addresses the problem of recovering multi-echo T1 or T2 weighted images from their partial K-space scans. Recent studies have shown that the best results are obtained when all the multi-echo images are reconstructed by simultaneously exploiting their intra-image spatial redundancy and inter-echo correlation. The aforesaid studies either stack the vectorised images (formed by row or columns concatenation) as columns of a Multiple Measurement Vector (MMV) matrix or concatenate them as a long vector. Owing to the inter-image correlation, the thus formed MMV matrix or the long concatenated vector is row-sparse or group-sparse respectively in a transform domain (wavelets). Consequently the reconstruction problem was formulated as a row-sparse MMV recovery or a group-sparse vector recovery. In this work we show that when the multi-echo images are arranged in the MMV form, the thus formed matrix is low-rank. We show that better reconstruction accuracy can be obtained when the information about rank-deficiency is incorporated into the row/group sparse recovery problem. Mathematically, this leads to a constrained optimization problem where the objective function promotes the signal's groups-sparsity as well as its rank-deficiency; the objective function is minimized subject to data fidelity constraints. The experiments were carried out on ex vivo and in vivo T2 weighted images of a rat's spinal cord. Results show that this method yields considerably superior results than state-of-the-art reconstruction techniques.

Subjects

Diagnostic Imaging; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Models, Theoretical; Algorithms; Animals; Brain/diagnostic imaging; Humans; Image Enhancement; Radiography; Rats; Rats, Sprague-Dawley
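
The constrained problem described in the abstract above can be written out in one plausible form. The notation is assumed rather than taken from the paper: X holds the transform-domain coefficients with one column per echo, A is the partial k-space measurement operator, Y collects the acquired samples, and the l2,1 norm and nuclear norm stand in as generic surrogates for group sparsity and rank. The paper may use different (for example non-convex) penalties.

```latex
\min_{X} \; \|X\|_{2,1} + \lambda \, \|X\|_{*}
\quad \text{subject to} \quad \|Y - \mathcal{A}(X)\|_{F}^{2} \le \varepsilon,
\qquad
\|X\|_{2,1} = \sum_{i} \|X_{i,:}\|_{2},
\qquad
\|X\|_{*} = \sum_{k} \sigma_{k}(X)
```

Here the weight \lambda balances the two penalties and \varepsilon bounds the data misfit; both are hypothetical tuning parameters.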

6.

Calibrationless parallel magnetic resonance imaging: a joint sparsity model.

Majumdar, Angshul; Chaudhury, Kunal Narayan; Ward, Rabab.

Sensors (Basel) ; 13(12): 16714-35, 2013 Dec 05.

Article in English | MEDLINE | ID: mdl-24316569

ABSTRACT

State-of-the-art parallel MRI techniques either explicitly or implicitly require certain parameters to be estimated, e.g., the sensitivity map for SENSE, SMASH and interpolation weights for GRAPPA, SPIRiT. Thus all these techniques are sensitive to the calibration (parameter estimation) stage. In this work, we have proposed a parallel MRI technique that does not require any calibration but yields reconstruction results that are at par with (or even better than) state-of-the-art methods in parallel MRI. Our proposed method required solving non-convex analysis and synthesis prior joint-sparsity problems. This work also derives the algorithms for solving them. Experimental validation was carried out on two datasets-eight channel brain and eight channel Shepp-Logan phantom. Two sampling methods were used-Variable Density Random sampling and non-Cartesian Radial sampling. For the brain data, acceleration factor of 4 was used and for the other an acceleration factor of 6 was used. The reconstruction results were quantitatively evaluated based on the Normalised Mean Squared Error between the reconstructed image and the originals. The qualitative evaluation was based on the actual reconstructed images. We compared our work with four state-of-the-art parallel imaging techniques; two calibrated methods-CS SENSE and l1SPIRiT and two calibration free techniques-Distributed CS and SAKE. Our method yields better reconstruction results than all of them.

Subjects

Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Brain/physiology; Calibration; Models, Theoretical
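
The quantitative metric named in the abstract above, the Normalised Mean Squared Error, can be computed as follows. This is a sketch under one common convention (squared error divided by the energy of the reference image); the abstract does not state which normalisation the authors used, and the function name and toy arrays are assumptions.

```python
import numpy as np

def nmse(reconstruction, reference):
    """Normalised mean squared error: ||x_hat - x||^2 / ||x||^2 (works for complex images)."""
    reconstruction = np.asarray(reconstruction)
    reference = np.asarray(reference)
    error_energy = np.sum(np.abs(reconstruction - reference) ** 2)
    return error_energy / np.sum(np.abs(reference) ** 2)

reference = np.array([[1.0, 2.0], [3.0, 4.0]])
perfect = nmse(reference, reference)               # identical images
blank = nmse(np.zeros_like(reference), reference)  # all-zero reconstruction
```

Under this convention a perfect reconstruction scores 0 and an all-zero image scores 1, so values are comparable across datasets with different intensity scales.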

7.

Graph Regularized Probabilistic Matrix Factorization for Drug-Drug Interactions Prediction.

Jain, Stuti; Chouzenoux, Emilie; Kumar, Kriti; Majumdar, Angshul.

IEEE J Biomed Health Inform ; 27(5): 2565-2574, 2023 05.

Article in English | MEDLINE | ID: mdl-37027562

ABSTRACT

Co-administration of two or more drugs simultaneously can result in adverse drug reactions. Identifying drug-drug interactions (DDIs) is necessary, especially for drug development and for repurposing old drugs. DDI prediction can be viewed as a matrix completion task, for which matrix factorization (MF) appears as a suitable solution. This paper presents a novel Graph Regularized Probabilistic Matrix Factorization (GRPMF) method, which incorporates expert knowledge through a novel graph-based regularization strategy within an MF framework. An efficient and sound optimization algorithm is proposed to solve the resulting non-convex problem in an alternating fashion. The performance of the proposed method is evaluated through the DrugBank dataset, and comparisons are provided against state-of-the-art techniques. The results demonstrate the superior performance of GRPMF when compared to its counterparts.

Subjects

Algorithms; Drug-Related Side Effects and Adverse Reactions; Humans; Drug Interactions; Pharmaceutical Preparations

8.

DeepVir: Graphical Deep Matrix Factorization for In Silico Antiviral Repositioning-Application to COVID-19.

Mongia, Aanchal; Jain, Stuti; Chouzenoux, Emilie; Majumdar, Angshul.

J Comput Biol ; 29(5): 441-452, 2022 05.

Article in English | MEDLINE | ID: mdl-35394368

ABSTRACT

This study formulates antiviral repositioning as a matrix completion problem wherein the antiviral drugs are along the rows and the viruses are along the columns. The input matrix is partially filled, with ones in positions where the antiviral drug has been known to be effective against a virus. The curated metadata for antivirals (chemical structure and pathways) and viruses (genomic structure and symptoms) are encoded into our matrix completion framework as graph Laplacian regularization. We then frame the resulting multiple graph regularized matrix completion (GRMC) problem as deep matrix factorization. This is solved by using a novel optimization method called HyPALM (Hybrid Proximal Alternating Linearized Minimization). Results of our curated RNA drug-virus association data set show that the proposed approach excels over state-of-the-art GRMC techniques. When applied to in silico prediction of antivirals for COVID-19, our approach returns antivirals that are either used for treating patients or are under trials for the same.

Subjects

COVID-19 Drug Treatment; Algorithms; Antiviral Agents/pharmacology; Antiviral Agents/therapeutic use; Humans
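
Graph Laplacian regularization, which the abstract above uses to encode drug and virus metadata, penalises latent factors that differ between similar items. The sketch below shows the standard construction and checks the usual identity between the trace form and the pairwise form. The similarity values, the latent factors, and the function names are hypothetical and not taken from the paper.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalised Laplacian L = D - W of a symmetric similarity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def laplacian_penalty(U, L):
    """tr(U^T L U): small when similar nodes (large W[i, j]) have close latent rows."""
    return np.trace(U.T @ L @ U)

# Hypothetical similarities among three drugs and their 2-D latent factors.
W = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.2],
              [0.1, 0.2, 0.0]])
U = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 2.0]])
penalty = laplacian_penalty(U, graph_laplacian(W))
# Equivalent pairwise form: 0.5 * sum_ij W[i, j] * ||U[i] - U[j]||^2.
pairwise = 0.5 * sum(W[i, j] * np.sum((U[i] - U[j]) ** 2)
                     for i in range(3) for j in range(3))
```

Adding a term of this kind to a factorization objective, once per similarity graph, is the general pattern behind graph-regularized matrix completion.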

9.

SelfE: Gene Selection via Self-Expression for Single-Cell Data.

Rai, Priyadarshini; Sengupta, Debarka; Majumdar, Angshul.

IEEE/ACM Trans Comput Biol Bioinform ; 19(1): 624-632, 2022.

Article in English | MEDLINE | ID: mdl-32750851

ABSTRACT

Single-cell RNA sequencing has been proved to be advantageous in discerning molecular heterogeneity in seemingly similar cells in a tissue. Due to the paucity of starting RNA, a large fraction of transcripts fail to amplify during the polymerase chain reaction cycle. This gets compounded by trivial biological noise such as variability in the cell cycle specific genes. As a result, the expression matrix obtained from a single-cell study is highly sparse with a large number of missing values. This hinders downstream analysis of single-cell expression data. It has been observed that feature engineering significantly improves the analysis outcomes. Feature extraction methods such as principal component analysis and zero-inflated factor analysis have been shown to be useful for subsequent steps of data analysis including clustering. However, little or no visible effort has gone into developing feature selection techniques, which offer transparency for the analyst's consumption. We propose SelfE, a novel l2,0-minimization algorithm that determines an optimal subset of feature vectors that preserves sub-space structures as observed in the data. We compared SelfE with the commonly used feature selection methods for single-cell expression data analysis.

Subjects

Gene Expression Profiling; Single-Cell Analysis; Algorithms; Cluster Analysis; Sequence Analysis, RNA

10.

Computational Prediction of Drug-Disease Association Based on Graph-Regularized One Bit Matrix Completion.

Mongia, Aanchal; Chouzenoux, Emilie; Majumdar, Angshul.

IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3332-3339, 2022.

Article in English | MEDLINE | ID: mdl-35816539

ABSTRACT

Investigation of existing drugs is an effective alternative to the discovery of new drugs for treating diseases. This task of drug re-positioning can be assisted by various kinds of computational methods to predict the best indication for a drug given the open-source biological datasets. Owing to the fact that similar drugs tend to have common pathways and disease indications, the association matrix is assumed to be of low-rank structure. Hence, the problem of drug-disease association prediction can be modeled as a low-rank matrix completion problem. In this work, we propose a novel matrix completion framework that makes use of the side-information associated with drugs/diseases for the prediction of drug-disease indications modeled as neighborhood graph: Graph regularized 1-bit matrix completion (GR1BMC). The algorithm is specially designed for binary data and uses parallel proximal algorithm to solve the aforesaid minimization problem taking into account all the constraints including the neighborhood graph incorporation and restricting predicted scores within the specified range. The results have been validated on two standard databases by evaluating the AUC across the 10-fold cross-validation splits. The usage of the method is also evaluated through a case study where top 5 indications are predicted for novel drugs, which then are verified with the CTD database.

Subjects

Algorithms; Computational Biology; Computational Biology/methods; Research Design; Databases, Factual; Data Management

11.

Gene expression based inference of cancer drug sensitivity.

Chawla, Smriti; Rockstroh, Anja; Lehman, Melanie; Ratther, Ellca; Jain, Atishay; Anand, Anuneet; Gupta, Apoorva; Bhattacharya, Namrata; Poonia, Sarita; Rai, Priyadarshini; Das, Nirjhar; Majumdar, Angshul; Ahuja, Gaurav; Hollier, Brett G; Nelson, Colleen C; Sengupta, Debarka.

Nat Commun ; 13(1): 5680, 2022 09 27.

Article in English | MEDLINE | ID: mdl-36167836

ABSTRACT

Inter and intra-tumoral heterogeneity are major stumbling blocks in the treatment of cancer and are responsible for imparting differential drug responses in cancer patients. Recently, the availability of high-throughput screening datasets has paved the way for machine learning based personalized therapy recommendations using the molecular profiles of cancer specimens. In this study, we introduce Precily, a predictive modeling approach to infer treatment response in cancers using gene expression data. In this context, we demonstrate the benefits of considering pathway activity estimates in tandem with drug descriptors as features. We apply Precily on single-cell and bulk RNA sequencing data associated with hundreds of cancer cell lines. We then assess the predictability of treatment outcomes using our in-house prostate cancer cell line and xenografts datasets exposed to differential treatment conditions. Further, we demonstrate the applicability of our approach on patient drug response data from The Cancer Genome Atlas and an independent clinical study describing the treatment journey of three melanoma patients. Our findings highlight the importance of chemo-transcriptomics approaches in cancer treatment selection.

Subjects

Antineoplastic Agents; Melanoma; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Gene Expression; Humans; Machine Learning; Male; Melanoma/drug therapy; Melanoma/genetics; Sequence Analysis, RNA

12.

A computational approach to aid clinicians in selecting anti-viral drugs for COVID-19 trials.

Mongia, Aanchal; Saha, Sanjay Kr; Chouzenoux, Emilie; Majumdar, Angshul.

Sci Rep ; 11(1): 9047, 2021 04 27.

Article in English | MEDLINE | ID: mdl-33907209

ABSTRACT

The year 2020 witnessed a heavy death toll due to COVID-19, calling for a global emergency. The continuous ongoing research and clinical trials paved the way for vaccines. But the vaccine efficacy in the long run is still questionable due to the mutating coronavirus, which makes drug re-positioning a reasonable alternative. COVID-19 has hence fast-paced drug re-positioning for the treatment of COVID-19 and its symptoms. This work builds computational models using matrix completion techniques to predict drug-virus association for drug re-positioning. The aim is to assist clinicians with a tool for selecting prospective antiviral treatments. Since the virus is known to mutate fast, the tool is likely to help clinicians in selecting the right set of antivirals for the mutated isolate. The main contribution of this work is a manually curated, publicly shared database comprising existing associations between viruses and their corresponding antivirals. The database gathers similarity information using the chemical structure of drugs and the genomic structure of viruses. Along with this database, we make available a set of state-of-the-art computational drug re-positioning tools based on matrix completion. The tools are first analysed on a standard set of experimental protocols for drug target interactions. The best performing ones are applied for the task of re-positioning antivirals for COVID-19. These tools select six drugs out of which four are currently under various stages of trial, namely Remdesivir (as a cure), Ribavirin (in combination with others for cure), Umifenovir (as a prophylactic and cure) and Sofosbuvir (as a cure). Another unanimous prediction is Tenofovir alafenamide, which is a novel Tenofovir prodrug developed in order to improve renal safety when compared to its original counterpart (older version) Tenofovir disoproxil. Both are under trial, the former as a cure and the latter as a prophylactic. These results establish that the computational methods are in sync with the state-of-practice. We also demonstrate how the drugs to be used against the virus would vary as SARS-CoV-2 mutates over time by predicting the drugs for the mutated strains, suggesting the importance of such a tool in drug prediction. We believe this work would open up possibilities for applying machine learning models to clinical research for drug-virus association prediction and other similar biological problems.

Subjects

Antiviral Agents/therapeutic use; COVID-19 Drug Treatment; Algorithms; Area Under Curve; COVID-19/virology; Databases, Factual; Drug Repositioning; Evolution, Molecular; Humans; Mutation; ROC Curve; SARS-CoV-2/genetics; SARS-CoV-2/isolation & purification

13.

Resource Constrained CVD Classification Using Single Lead ECG On Wearable and Implantable Devices.

Ukil, Arijit; Sahu, Ishan; Majumdar, Angshul; Racha, Sai Chander; Kulkarni, Gitesh; Choudhury, Anirban Dutta; Khandelwal, Sundeep; Ghose, Avik; Pal, Arpan.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 886-889, 2021 11.

Article in English | MEDLINE | ID: mdl-34891432

ABSTRACT

Electrocardiogram (ECG) is one of the fundamental markers to detect different cardiovascular diseases (CVDs). Owing to the widespread availability of ECG sensors (single lead) as well as smartwatches with ECG recording capability, ECG classification using wearable devices to detect different CVDs has become a basic requirement for a smart healthcare ecosystem. In this paper, we propose a novel method of model compression with robust detection capability for CVDs from ECG signals such that the sophisticated and effective baseline deep neural network model can be optimized for the resource constrained micro-controller platform suitable for wearable devices while minimizing the performance loss. We employ knowledge distillation-based model compression approach where the baseline (teacher) deep neural network model is compressed to a TinyML (student) model using piecewise linear approximation. Our proposed ECG TinyML has achieved ~156x compression factor to suit to the requirement of 100KB memory availability for model deployment on wearable devices. The proposed model requires ~5782 times (estimated) less computational load than state-of-the-art residual neural network (ResNet) model with negligible performance loss (less than 1% loss in test accuracy, test sensitivity, test precision and test F1-score). We further feel that the small footprint model size of ECG TinyML (62.3 KB) can be suitably deployed in implantable devices including implantable loop recorder (ILR).

Subjects

Cardiovascular Diseases; Data Compression; Wearable Electronic Devices; Ecosystem; Electrocardiography; Humans
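
Knowledge distillation, the compression approach named in the abstract above, trains a small student model to match the softened output distribution of a larger teacher. The sketch below shows only a generic distillation loss; it does not implement the piecewise linear approximation or the TinyML model from the paper, and the temperature, logits, and function names are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over temperature-softened class probabilities."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

teacher = np.array([[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]])  # hypothetical teacher logits
student = np.array([[2.0, 1.5, 0.5], [1.0, 1.2, 0.8]])  # hypothetical student logits
loss = distillation_loss(student, teacher)
```

In practice a loss of this kind is usually combined with the ordinary classification loss on the true labels; the weighting between the two is another tuning choice not specified in the abstract.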

14.

Graph transform learning.

Majumdar, Angshul.

Neural Netw ; 128: 248-253, 2020 Aug.

Article in English | MEDLINE | ID: mdl-32454369

ABSTRACT

Transform learning is a new representation learning framework where we learn an operator/transform that analyses the data to generate the coefficient/representation. We propose a variant of it called the graph transform learning; in this we explicitly account for the correlation in the dataset in terms of graph Laplacian. We will give two variants; in the first one the graph is computed from the data and fixed during the operation. In the second, the graph is learnt iteratively from the data during operation. The first technique will be applied for clustering, and the second one for solving inverse problems.

Subjects

Magnetic Resonance Imaging/methods; Unsupervised Machine Learning; Algorithms; Cluster Analysis; Humans; Magnetic Resonance Imaging/trends; Problem Solving; Unsupervised Machine Learning/trends

15.

Drug-target interaction prediction using Multi Graph Regularized Nuclear Norm Minimization.

Mongia, Aanchal; Majumdar, Angshul.

PLoS One ; 15(1): e0226484, 2020.

Article in English | MEDLINE | ID: mdl-31945078

ABSTRACT

The identification of potential interactions between drugs and target proteins is crucial in pharmaceutical sciences. The experimental validation of interactions in genomic drug discovery is laborious and expensive; hence, there is a need for efficient and accurate in-silico techniques which can predict potential drug-target interactions to narrow down the search space for experimental verification. In this work, we propose a new framework, namely, Multi-Graph Regularized Nuclear Norm Minimization, which predicts the interactions between drugs and target proteins from three inputs: known drug-target interaction network, similarities over drugs and those over targets. The proposed method focuses on finding a low-rank interaction matrix that is structured by the proximities of drugs and targets encoded by graphs. Previous works on Drug Target Interaction (DTI) prediction have shown that incorporating drug and target similarities helps in learning the data manifold better by preserving the local geometries of the original data. But, there is no clear consensus on which kind and what combination of similarities would best assist the prediction task. Hence, we propose to use various multiple drug-drug similarities and target-target similarities as multiple graph Laplacian (over drugs/targets) regularization terms to capture the proximities exhaustively. Extensive cross-validation experiments on four benchmark datasets using standard evaluation metrics (AUPR and AUC) show that the proposed algorithm improves the predictive performance and outperforms recent state-of-the-art computational methods by a large margin. Software is publicly available at https://github.com/aanchalMongia/MGRNNMforDTI.

Subjects

Algorithms; Computer Graphics; Drug Development/methods; Drug Discovery/methods; Drug Interactions; Pharmaceutical Preparations/metabolism; Proteins/metabolism; Computer Simulation; Humans; Pharmaceutical Preparations/chemistry; Proteins/chemistry
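
Nuclear norm minimization, the low-rank ingredient of the method described above, is commonly handled with singular value soft-thresholding. The sketch below shows that single operator on a toy matrix. It is not the published MGRNNM solver (available at the URL in the abstract), and the function name, threshold, and toy scores are assumptions.

```python
import numpy as np

def svt(X, tau):
    """Soft-threshold the singular values of X (proximal operator of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

scores = np.diag([3.0, 1.0])   # toy drug-target score matrix with singular values 3 and 1
shrunk = svt(scores, tau=2.0)  # singular values below tau are removed, lowering the rank
```

Repeated application of this step, interleaved with a data-fidelity update and the graph regularization terms, is one standard way to build a low-rank completion solver.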

16.

deepMc: Deep Matrix Completion for Imputation of Single-Cell RNA-seq Data.

Mongia, Aanchal; Sengupta, Debarka; Majumdar, Angshul.

J Comput Biol ; 27(7): 1011-1019, 2020 07.

Article in English | MEDLINE | ID: mdl-31657645

ABSTRACT

Single-cell RNA-seq has inspired new discoveries and innovation in the field of developmental and cell biology for the past few years and is useful for studying cellular responses at individual cell resolution. But, due to the paucity of starting RNA, the data acquired have dropouts. To address this, we propose a deep matrix factorization-based method, deepMc, to impute missing values in gene expression data. For the deep architecture of our approach, we draw our motivation from great success of deep learning in solving various machine learning problems. In this study, we support our method with positive results on several evaluation metrics such as clustering of cell populations, differential expression analysis, and cell type separability.

Subjects

Computational Biology/methods; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Animals; Blastocyst/cytology; Deep Learning; HEK293 Cells; Humans; Jurkat Cells; Mice; Sequence Analysis, RNA/statistics & numerical data; Single-Cell Analysis/statistics & numerical data

17.

FITs: forest of imputation trees for recovering true signals in single-cell open chromatin profiles.

Sharma, Rachesh; Pandey, Neetesh; Mongia, Aanchal; Mishra, Shreya; Majumdar, Angshul; Kumar, Vibhor.

NAR Genom Bioinform ; 2(4): lqaa091, 2020 Dec.

Article in English | MEDLINE | ID: mdl-33575635

ABSTRACT

The advent of single-cell open-chromatin profiling technology has facilitated the analysis of heterogeneity of activity of regulatory regions at single-cell resolution. However, stochasticity and the low amount of relevant DNA available cause a high drop-out rate and noise in single-cell open-chromatin profiles. We introduce here a robust method called forest of imputation trees (FITs) to recover original signals from highly sparse and noisy single-cell open-chromatin profiles. FITs makes multiple imputation trees to avoid bias during the restoration of read-count matrices. It resolves the challenging issue of recovering open chromatin signals without blurring out information at genomic sites with cell-type-specific activity. Besides visualization and classification, FITs-based imputation also improved accuracy in the detection of enhancers, calculating pathway enrichment score and prediction of chromatin-interactions. FITs is generalized for wider applicability, especially for highly sparse read-count matrices. The superiority of FITs in recovering signals of minority cells also makes it highly useful for single-cell open-chromatin profiles from in vivo samples. The software is freely available at https://reggenlab.github.io/FITs/.

18.

Recurrent transform learning.

Majumdar, Angshul; Gupta, Megha.

Neural Netw ; 118: 271-279, 2019 Oct.

Article in English | MEDLINE | ID: mdl-31326661

ABSTRACT

Recurrent neural networks (RNN) model time series by feeding back the representation from the previous time instant as an input for the current instant along with exogenous inputs. Two main shortcomings of RNNs are (1) the problem of vanishing gradients while backpropagating through time and (2) the inability to learn in an unsupervised manner. Variants like long-short term memory (LSTM) network and gated recurrent units (GRU) have partially circumvented the first issue; the second issue still remains. In this work we propose a new variant of RNN based on the transform learning model, named recurrent transform learning (RTL). It can learn in an unsupervised, supervised and semi-supervised fashion; it does not require backpropagation and hence does not suffer from the pitfalls of vanishing gradients. The proposed model is applied on a real-life example of short-term load forecasting, where we show that RTL improves over existing variants of RNN as well as on a state-of-the-art technique in load forecasting based on sparse coding.

Subjects

Machine Learning; Neural Networks, Computer; Forecasting

19.

McImpute: Matrix Completion Based Imputation for Single Cell RNA-seq Data.

Mongia, Aanchal; Sengupta, Debarka; Majumdar, Angshul.

Front Genet ; 10: 9, 2019.

Article in English | MEDLINE | ID: mdl-30761179

ABSTRACT

Motivation: Single-cell RNA sequencing has been proved to be revolutionary for its potential of zooming into complex biological systems. Genome-wide expression analysis at single-cell resolution provides a window into dynamics of cellular phenotypes. This facilitates the characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, a typical single cell RNA sequencing data features a high number of dropout events where transcripts fail to get amplified. Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in the separation of true zeros from dropouts, cell-clustering, differential expression analysis, cell type separability, the performance of dimensionality reduction techniques for cell visualization, and gene distribution. Availability and Implementation: https://github.com/aanchalMongia/McImpute_scRNAseq.

20.

Blind Denoising Autoencoder.

Majumdar, Angshul.

IEEE Trans Neural Netw Learn Syst ; 2018 Jun 12.

Article in English | MEDLINE | ID: mdl-29994276

ABSTRACT

The term "blind denoising" refers to the fact that the basis used for denoising is learned from the noisy sample itself during denoising. Dictionary learning- and transform learning-based formulations for blind denoising are well known. But there has been no autoencoder-based solution for the said blind denoising approach. So far, autoencoder-based denoising formulations have learned the model on separate training data and have used the learned model to denoise test samples. Such a methodology fails when the test image (to denoise) is not of the same kind as the images the model was learned on. This is the first work in which the autoencoder is learned from the noisy sample while denoising. Experimental results show that our proposed method performs better than dictionary learning (K-singular value decomposition), transform learning, sparse stacked denoising autoencoder, and the gold standard BM3D algorithm.
