Search | Biblioteca Virtual em Saúde (Virtual Health Library)

NoVaTeST: identifying genes with location-dependent noise variance in spatial transcriptomics data.

Abrar, Mohammed Abid; Kaykobad, M; Rahman, M Saifur; Samee, Md Abul Hassan.

Bioinformatics ; 39(6), 2023 06 01.

Article in English | MEDLINE | ID: mdl-37285319

ABSTRACT

MOTIVATION: Spatial transcriptomics (ST) can reveal the existence and extent of spatial variation of gene expression in complex tissues. Such analyses could help identify spatially localized processes underlying a tissue's function. Existing tools to detect spatially variable genes assume a constant noise variance across spatial locations. This assumption might miss important biological signals when the variance can change across locations. RESULTS: In this article, we propose NoVaTeST, a framework to identify genes with location-dependent noise variance in ST data. NoVaTeST models gene expression as a function of spatial location and allows the noise to vary spatially. NoVaTeST then statistically compares this model to one with constant noise and detects genes showing significant spatial noise variation. We refer to these genes as "noisy genes." In tumor samples, the noisy genes detected by NoVaTeST are largely independent of the spatially variable genes detected by existing tools that assume constant noise, and provide important biological insights into tumor microenvironments. AVAILABILITY AND IMPLEMENTATION: An implementation of the NoVaTeST framework in Python along with instructions for running the pipeline is available at https://github.com/abidabrar-bracu/NoVaTeST.
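
The comparison between a constant-noise model and a spatially varying one can be illustrated with a likelihood-ratio test. The following is a minimal sketch, not the NoVaTeST implementation: it assumes Gaussian noise, splits the spots into just two hypothetical regions, and uses a chi-square reference distribution with one degree of freedom. The function names `gaussian_loglik` and `variance_lrt` are illustrative only.

```python
import math


def gaussian_loglik(values, mean, var):
    """Log-likelihood of the values under a Normal(mean, var) model."""
    n = len(values)
    ss = sum((v - mean) ** 2 for v in values)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def variance_lrt(region_a, region_b):
    """Likelihood-ratio test: one shared noise variance vs. one variance per region.

    Assumption (not from the paper): two regions, Gaussian noise, and a
    separate mean per region in both models, so only the noise term differs.
    Each region must contain at least two distinct values.
    Returns (statistic, p_value).
    """
    n_a, n_b = len(region_a), len(region_b)
    mean_a = sum(region_a) / n_a
    mean_b = sum(region_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in region_a)
    ss_b = sum((v - mean_b) ** 2 for v in region_b)

    # Maximum-likelihood variance estimates under each model.
    var_a = ss_a / n_a
    var_b = ss_b / n_b
    var_shared = (ss_a + ss_b) / (n_a + n_b)

    ll_null = (gaussian_loglik(region_a, mean_a, var_shared)
               + gaussian_loglik(region_b, mean_b, var_shared))
    ll_alt = (gaussian_loglik(region_a, mean_a, var_a)
              + gaussian_loglik(region_b, mean_b, var_b))

    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p-value suggests that the noise variance differs between the two regions; a gene with identical spread in both regions yields a statistic near zero.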

Subjects

Software, Transcriptome, Gene Expression Profiling

Antigenic: An improved prediction model of protective antigens.

Rahman, M Saifur; Rahman, Md Khaledur; Saha, Sanjay; Kaykobad, M; Rahman, M Sohel.

Artif Intell Med ; 94: 28-41, 2019 03.

Article in English | MEDLINE | ID: mdl-30871681

ABSTRACT

An antigen is a protein capable of triggering an effective immune system response. Protective antigens are those that can invoke a specific and enhanced adaptive immune response upon subsequent exposure to the specific pathogen or related organisms. Such proteins are therefore of immense importance in vaccine preparation and drug design. However, laboratory experiments to isolate and identify antigens from a microbial pathogen are expensive, time-consuming, and often unsuccessful. This is why Reverse Vaccinology has become the modern trend in vaccine search: computational methods are first applied to predict protective antigens or their determinants, known as epitopes. In this paper, we propose a novel, accurate computational model to identify protective antigens efficiently. Our model extracts features directly from the protein sequences, without any dependence on functional domain or structural information. After the relevant features are extracted, we use the Random Forest algorithm to rank them. Recursive Feature Elimination (RFE) and the minimum redundancy maximum relevance (mRMR) criterion are then applied to extract an optimal set of features. The learning model is trained using the Random Forest algorithm. Our proposed model, named Antigenic, demonstrates superior performance compared to the state-of-the-art predictors on a benchmark dataset. Antigenic achieves accuracy, sensitivity, and specificity values of 78.04%, 78.99%, and 77.08%, respectively, in 10-fold cross-validation testing. In jackknife cross-validation, the corresponding scores are 80.03%, 80.90%, and 79.16%, respectively. The source code of Antigenic, along with the relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/AntigenPredictor. A publicly accessible web interface has also been established at http://antigenic.research.buet.ac.bd.
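
The accuracy, sensitivity, and specificity figures quoted above are standard confusion-matrix quantities. The sketch below shows how they are computed for a binary predictor; it is an illustration only, and the function name `classification_metrics` and the 0/1 label encoding are assumptions, not taken from the Antigenic source code.

```python
def classification_metrics(true_labels, predicted_labels):
    """Return (accuracy, sensitivity, specificity) for binary 0/1 labels.

    Assumption: 1 marks the positive class (e.g. a protective antigen)
    and both classes occur at least once among the true labels.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of positives recovered
    specificity = tn / (tn + fp)   # fraction of negatives recovered
    return accuracy, sensitivity, specificity
```

Reporting sensitivity and specificity alongside accuracy shows whether a predictor favors one class, which accuracy alone can hide.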

Subjects

Antigens/analysis, Biological Models, Algorithms, Amino Acids/analysis, Antigens/chemistry, Computational Biology/methods

DPP-PseAAC: A DNA-binding protein prediction model using Chou's general PseAAC.

Rahman, M Saifur; Shatabda, Swakkhar; Saha, Sanjay; Kaykobad, M; Rahman, M Sohel.

J Theor Biol ; 452: 22-34, 2018 09 07.

Article in English | MEDLINE | ID: mdl-29753757

ABSTRACT

A DNA-binding protein (DNA-BP) is a protein that can bind to and interact with DNA. Identifying DNA-BPs with experimental methods is both expensive and time-consuming. Fast and accurate computational methods are therefore sought for predicting whether or not a protein can bind to DNA. In this paper, we focus on building a new computational model to identify DNA-BPs efficiently and accurately. Our model extracts meaningful information directly from the protein sequences, without any dependence on functional domain or structural information. After feature extraction, we employ a Random Forest (RF) model to rank the features. We then use the Recursive Feature Elimination (RFE) method to extract an optimal set of features and train a prediction model using a Support Vector Machine (SVM) with a linear kernel. Our proposed method, named DNA-binding Protein Prediction model using Chou's general PseAAC (DPP-PseAAC), demonstrates superior performance compared to the state-of-the-art predictors on a standard benchmark dataset. DPP-PseAAC achieves accuracy values of 93.21%, 95.91%, and 77.42% for the 10-fold cross-validation test, the jackknife test, and the independent test, respectively. The source code of DPP-PseAAC, along with the relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/DNABinding. A publicly accessible web interface has also been established at http://77.68.43.135:8080/DPP-PseAAC/.
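
The 10-fold cross-validation and jackknife tests differ only in how the samples are partitioned. The sketch below assumes that the jackknife test means leave-one-out cross-validation, as is common in this literature; the helper names `k_fold_splits` and `jackknife_splits` are hypothetical and not part of the DPP-PseAAC code.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one. In practice the sample order is
    usually shuffled first; that step is omitted to keep the sketch
    deterministic.
    """
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def jackknife_splits(n_samples):
    """Leave-one-out: every sample is the test set exactly once."""
    return k_fold_splits(n_samples, n_samples)
```

For example, `list(k_fold_splits(10, 3))` produces three train/test pairs, while `jackknife_splits(10)` produces ten pairs with a single test sample each. An independent test instead uses a separate dataset that plays no part in training.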

Subjects

Algorithms, Computational Biology/methods, DNA-Binding Proteins/metabolism, Support Vector Machine, Amino Acid Sequence, Amino Acids/chemistry, Amino Acids/genetics, Amino Acids/metabolism, DNA/chemistry, DNA/genetics, DNA/metabolism, DNA-Binding Proteins/chemistry, DNA-Binding Proteins/genetics, Protein Databases, Molecular Models, Nucleic Acid Conformation, Protein Domains, Reproducibility of Results

isGPT: An optimized model to identify sub-Golgi protein types using SVM and Random Forest based feature selection.

Rahman, M Saifur; Rahman, Md Khaledur; Kaykobad, M; Rahman, M Sohel.

Artif Intell Med ; 84: 90-100, 2018 01.

Article in English | MEDLINE | ID: mdl-29183738

ABSTRACT

The Golgi Apparatus (GA) is a key organelle for protein synthesis within the eukaryotic cell. The main task of the GA is to modify and sort proteins for transport throughout the cell. Proteins permeate through the GA on the side facing the ER (Endoplasmic Reticulum), the cis side, and depart on the other side, the trans side. This gives two types of GA proteins: cis-Golgi proteins and trans-Golgi proteins. Any dysfunction of GA proteins can result in congenital glycosylation disorders and other difficulties that may lead to neurodegenerative and inherited diseases such as diabetes, cancer, and cystic fibrosis. The exact classification of GA proteins may therefore contribute to drug development, which will in turn help in medication. In this paper, we focus on building a new computational model that not only introduces easy ways to extract features from protein sequences but also optimizes the classification of trans-Golgi and cis-Golgi proteins. After feature extraction, we employ a Random Forest (RF) model to rank the features by the importance score obtained from it. After selecting the top-ranked features, we apply a Support Vector Machine (SVM) to classify the sub-Golgi proteins. We trained a regression model as well as a classification model and found the former to be superior. The model shows improved performance over all previous methods. As the benchmark dataset is significantly imbalanced, we applied the Synthetic Minority Over-sampling Technique (SMOTE) to balance it and conducted experiments on both versions. Our method, identification of sub-Golgi Protein Types (isGPT), achieves accuracy values of 95.4%, 95.9%, and 95.3% for the 10-fold cross-validation test, the jackknife test, and the independent test, respectively. According to different performance metrics, isGPT performs better than state-of-the-art techniques. The source code of isGPT, along with the relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/isGPT.
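
The balancing step mentioned above can be illustrated with a simplified version of SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. This is a minimal sketch under stated assumptions, not the isGPT code: the function name `smote`, its parameters, and the default `k=3` are illustrative choices.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (at least two samples).
    k: number of nearest minority neighbors considered for each base sample.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. Oversampling is normally applied to the training folds only, so that the test data remain untouched.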

Subjects

Computational Biology/methods, Golgi Apparatus/chemistry, Oligopeptides/analysis, Proteins/analysis, Support Vector Machine, Amino Acid Sequence, Animals, Protein Databases, Humans, Oligopeptides/classification, Proteins/classification, Reproducibility of Results

New sufficient conditions for Hamiltonian paths.

Rahman, M Sohel; Kaykobad, M; Firoz, Jesun Sahariar.

ScientificWorldJournal ; 2014: 743431, 2014.

Article in English | MEDLINE | ID: mdl-25045745

ABSTRACT

A Hamiltonian path in a graph is a path that visits every vertex of the graph exactly once. In this paper, we revisit the famous Hamiltonian path problem and present new sufficient conditions for the existence of a Hamiltonian path in a graph.
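
For small graphs, the existence of a Hamiltonian path can be checked directly by exhaustive backtracking. The sketch below is such a brute-force search with worst-case exponential running time; it does not implement the sufficient conditions of the paper, which the abstract does not state. The function name `find_hamiltonian_path` and the adjacency-dictionary input format are assumptions made for illustration.

```python
def find_hamiltonian_path(adjacency):
    """Return a list of vertices forming a Hamiltonian path, or None.

    adjacency: dict mapping each vertex to the set of its neighbors
    (undirected graph, so every edge appears in both directions).
    """
    vertices = list(adjacency)
    n = len(vertices)

    def extend(path, visited):
        # The path is Hamiltonian once it contains every vertex.
        if len(path) == n:
            return path
        for nxt in adjacency[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                result = extend(path + [nxt], visited)
                if result is not None:
                    return result
                visited.remove(nxt)  # backtrack
        return None

    # Try every vertex as a starting point.
    for start in vertices:
        result = extend([start], {start})
        if result is not None:
            return result
    return None
```

A sufficient condition is useful because it can certify that a path exists without running a search of this kind.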

Variational Monte Carlo calculations for the binding energy of Lambda Lambda 31Si.

Ahsan MH; Kaykobad M; Ali S.

Phys Rev C Nucl Phys ; 43(1): 146-151, 1991 Jan.

Article in English | MEDLINE | ID: mdl-9967054
