Results 1 - 20 of 81
1.
Genome Res ; 31(4): 689-697, 2021 04.
Article in English | MEDLINE | ID: mdl-33674351

ABSTRACT

Systematic delineation of complex biological systems is an ever-challenging and resource-intensive process. Single-cell transcriptomics allows us to study cell-to-cell variability in complex tissues at an unprecedented resolution. Accurate modeling of gene expression plays a critical role in the statistical determination of tissue-specific gene expression patterns. In the past few years, considerable efforts have been made to identify appropriate parametric models for single-cell expression data. The zero-inflated version of Poisson/negative binomial and log-normal distributions have emerged as the most popular alternatives owing to their ability to accommodate high dropout rates, as commonly observed in single-cell data. Although the majority of the parametric approaches directly model expression estimates, we explore the potential of modeling expression ranks, as robust surrogates for transcript abundance. Here we examined the performance of the discrete generalized beta distribution (DGBD) on real data and devised a Wald-type test for comparing gene expression across two phenotypically divergent groups of single cells. We performed a comprehensive assessment of the proposed method to understand its advantages compared with some of the existing best-practice approaches. We concluded that besides striking a reasonable balance between Type I and Type II errors, ROSeq, the proposed differential expression test, is exceptionally robust to expression noise and scales rapidly with increasing sample size. For wider dissemination and adoption of the method, we created an R package called ROSeq and made it available on the Bioconductor platform.


Subjects
Gene Expression Profiling , RNA-Seq , Single-Cell Analysis , Transcriptome
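Note on entry 1: the abstract models expression ranks with the discrete generalized beta distribution (DGBD). The DGBD is commonly written as f(r) = A (N + 1 - r)^b / r^a for ranks r = 1..N, where A normalizes the probabilities. The sketch below only evaluates that formula; it is not the ROSeq package's code, and the function name `dgbd_pmf` and the parameter values are illustrative assumptions.

```python
def dgbd_pmf(n_ranks, a, b):
    """Return DGBD probabilities for ranks 1..n_ranks.

    Hypothetical helper: evaluates f(r) = A * (N + 1 - r)**b / r**a,
    with A chosen so that the probabilities sum to 1.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    # Unnormalized weight for each rank r = 1..N.
    weights = [((n_ranks + 1 - r) ** b) / (r ** a) for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative parameters only; a and b would normally be fitted to data.
example_probs = dgbd_pmf(n_ranks=5, a=1.0, b=0.5)
```

With a = b = 0 the distribution is uniform over ranks; positive a and b put more mass on the low ranks.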
2.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34849560

ABSTRACT

Prostate cancer is the second leading cause of cancer-related death in men. Metastatic disease shows poor survival even though the recovery rate is high. In spite of numerous studies on prostate carcinoma, multiple questions are still unanswered. In this regard, gene regulatory networks can uncover the mechanisms behind cancer progression and metastasis. Within a feed-forward loop, transcription factors (TFs) can be good druggable candidates. We propose a computational model to study the uncertainty of TFs and suggest the appropriate cellular conditions for drug targeting. We selected feed-forward loops based on the shared list of functional annotations among TFs, genes and miRNAs. From the potential feed-forward loop cores, six TFs were identified as druggable targets: AR, CEBPB, CREB1, ETS1, NFKB1 and RELA. However, TFs are known for their protein moonlighting properties, which provide unrelated multi-functionalities within the same or different subcellular localizations. We therefore identified the functions that are suitable for drug targeting. We also tried to identify membraneless organelles to give more specificity to the proposed time and space theory. The study suggests certain possibilities for TF-based therapeutics. The controlled dynamic nature of TFs may improve the chances that they can be considered prime drug targets. Finally, the combination of membraneless phase separation and protein moonlighting suggests a possible druggable period within the biological clock.


Subjects
Gene Regulatory Networks , Prostatic Neoplasms , Transcription Factors , Gene Expression Regulation , Gene Regulatory Networks/genetics , Humans , Male , MicroRNAs/genetics , MicroRNAs/metabolism , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/genetics , Transcription Factors/drug effects , Transcription Factors/genetics , Transcription Factors/metabolism
3.
Brief Bioinform ; 22(2): 914-923, 2021 03 22.
Article in English | MEDLINE | ID: mdl-32968798

ABSTRACT

The novel coronavirus disease (COVID-19) was first found in Wuhan, China, and became a pandemic. Angiotensin-converting enzyme 2 (ACE2) plays a key role in host cells as a receptor of the Spike-I glycoprotein of the COVID-19 virus, which ultimately causes infection. ACE2 is highly expressed in the bladder, ileum, kidney and liver, compared with ACE2 expression in the lung-specific pulmonary alveolar type II cells. In this study, single-cell RNAseq data of the five tissues from different humans are curated and cell types with high expression of ACE2 are identified. Subsequently, protein-protein interaction networks have been established. From the network, potential biomarkers that can form functional hubs are selected based on k-means network clustering. It is observed that angiotensin and PPAR family proteins play important roles in the functional hubs. To understand the functions of the potential markers, the corresponding pathways have been studied thoroughly through pathway semantic networks. Subsequently, the pathways have been ranked according to their influence and dependency in the network using the PageRank algorithm. The outcomes show some important facts in terms of infection. Firstly, the renin-angiotensin system and the PPAR signaling pathway can play a vital role in enhancing the infection after its intrusion through ACE2. Next, the pathway networks consist of a few basic metabolic and influential pathways, e.g. insulin resistance. This information corroborates the fact that diabetic patients are more vulnerable to COVID-19 infection. Interestingly, the key regulators of the aforementioned pathways are angiotensin and PPAR family proteins. Hence, angiotensin and PPAR family proteins can be considered as possible therapeutic targets. Contact: sagnik.sen2008@gmail.com, umaulik@cse.jdvu.ac.in Supplementary information: Supplementary data are available online.


Subjects
COVID-19/metabolism , SARS-CoV-2/pathogenicity , Algorithms , Angiotensin-Converting Enzyme 2/metabolism , COVID-19/virology , Humans , Ileum/metabolism , Ileum/pathology , Kidney/metabolism , Kidney/pathology , Liver/metabolism , Liver/pathology , Peroxisome Proliferator-Activated Receptors/metabolism , Protein Interaction Maps , Renin-Angiotensin System/physiology , Signal Transduction , Spike Glycoprotein, Coronavirus/metabolism , Urinary Bladder/metabolism , Urinary Bladder/pathology
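Note on entry 3: the abstract ranks pathways in a network with the PageRank algorithm. As a rough illustration of how such a ranking works, the sketch below runs the standard PageRank power iteration on a small directed graph. The graph, the node labels, and the damping factor of 0.85 are assumptions for illustration; the study's actual pathway network and settings are not given in the abstract.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {node: [successor nodes]}.

    Every node must appear as a key. Nodes without successors
    (dangling nodes) spread their rank evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Teleportation term shared by every node.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph with placeholder labels (not real pathway data).
toy_graph = {"pathway_a": ["pathway_c"], "pathway_b": ["pathway_c"], "pathway_c": ["pathway_a"]}
toy_ranks = pagerank(toy_graph)
```

In the toy graph, `pathway_c` receives links from both other nodes and therefore ends up with the highest score.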
4.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34143202

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus disease (COVID-19), is a part of the β-Coronaviridae family. The virus contains five major protein classes, viz. four structural proteins [nucleocapsid (N), membrane (M), envelope (E) and spike glycoprotein (S)] and replicase polyproteins (R), synthesized as two polyproteins (ORF1a and ORF1ab). Due to the severity of the pandemic, most SARS-CoV-2-related research is focused on finding therapeutic solutions. However, studies on the sequence and structure space throughout the evolutionary time frame of viral proteins are limited. Besides, the structural malleability of viral proteins can be directly or indirectly associated with the dysfunctionality of the host cell proteins. This dysfunctionality may lead to comorbidities during the infection and may continue at the post-infection stage. In this regard, we conduct an evolutionary sequence-structure analysis of the viral proteins to evaluate their malleability. Subsequently, intrinsic disorder propensities of these viral proteins have been studied to confirm that the short intrinsically disordered regions play an important role in enhancing the likelihood of the host proteins interacting with the viral proteins. These interactions may result in molecular dysfunctionality, finally leading to different diseases. Based on the host cell proteins, the diseases are divided into two distinct classes: (i) proteins directly associated with the set of diseases while showing similar activities, and (ii) cytokine storm-mediated pro-inflammation (e.g. acute respiratory distress syndrome, malignancies) and neuroinflammation (e.g. neurodegenerative and neuropsychiatric diseases). Finally, the study unveils that males and postmenopausal females can be more vulnerable to SARS-CoV-2 infection due to the androgen-mediated protein transmembrane serine protease 2.


Subjects
COVID-19/genetics , Genome, Viral/genetics , Protein Conformation , SARS-CoV-2/ultrastructure , COVID-19/virology , Coronavirus Envelope Proteins/genetics , Coronavirus Envelope Proteins/ultrastructure , Humans , Membrane Proteins/genetics , Membrane Proteins/ultrastructure , Nucleocapsid Proteins/genetics , Nucleocapsid Proteins/ultrastructure , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/ultrastructure , Viral Replicase Complex Proteins/genetics , Viral Replicase Complex Proteins/ultrastructure , Viral Structural Proteins/genetics , Viral Structural Proteins/ultrastructure
5.
Methods ; 203: 108-115, 2022 07.
Article in English | MEDLINE | ID: mdl-35364279

ABSTRACT

The ongoing global pandemic of COVID-19, caused by SARS-CoV-2, has killed more than 5.9 million individuals out of ∼43 million confirmed infections. At present, several parts of the world are encountering the third wave. Mass vaccination has started in several countries, but because vaccines are less likely to be broadly available for the current pandemic, repurposing of existing drugs has drawn the most attention as an immediate solution. A recent publication has mapped the physical interactions of SARS-CoV-2 and human proteins by affinity-purification mass spectrometry (AP-MS) and identified 332 high-confidence SARS-CoV-2-human protein-protein interactions (PPIs). Here, we have taken a network biology approach and constructed a human protein-protein interaction network (PPIN) with the above SARS-CoV-2 targeted proteins. We utilized a combination of essential network centrality measures and functional properties of the human proteins to identify the critical human targets of SARS-CoV-2. Four human proteins, namely PRKACA, RHOA, CDK5RAP2, and CEP250, have emerged as the best therapeutic targets, of which PRKACA and CEP250 were also found by another group as potential candidates for drug targets in COVID-19. We further found candidate drugs/compounds, such as guanosine triphosphate, remdesivir, adenosine monophosphate, MgATP, and H-89 dihydrochloride, that bind the target human proteins. The urgency to prevent the spread of infection and the death of diseased individuals has prompted the search for agents from the pool of approved drugs to repurpose them for COVID-19. Our results indicate that host-targeting therapy with the repurposed drugs may be a useful strategy for the treatment of SARS-CoV-2 infection.


Subjects
Antiviral Agents , COVID-19 Drug Treatment , Antiviral Agents/pharmacology , Antiviral Agents/therapeutic use , Autoantigens , Cell Cycle Proteins , Drug Repositioning , Humans , Nerve Tissue Proteins , Pandemics , SARS-CoV-2
6.
Cell Mol Life Sci ; 76(20): 4145-4154, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31011770

ABSTRACT

A crucial contribution to the heterogeneity of the conformational landscape of a protein comes from the way an intermediate relates to another intermediate state in its journey from the unfolded to folded or misfolded form. Unfortunately, it is extremely hard to decode this relatedness in a quantifiable manner. Here, we developed an application of statistical cluster analyses to explore the conformational heterogeneity of a metalloenzyme, human cytosolic copper-zinc superoxide dismutase (SOD1), using the inputs from infrared spectroscopy. This study provides a quantifiable picture of how conformational information at one particular site (for example, the copper-binding pocket) is related to the information at the second site (for example, the zinc-binding pocket), and how this relatedness is transferred to the global conformational information of the protein. The distance outputs were used to quantitatively generate a network capturing the folding sub-stages of SOD1.


Subjects
Copper/chemistry , Protein Aggregates/genetics , Superoxide Dismutase-1/chemistry , Zinc/chemistry , Binding Sites , Cloning, Molecular , Cluster Analysis , Copper/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression , Genetic Vectors/chemistry , Genetic Vectors/metabolism , Humans , Models, Molecular , Mutagenesis, Site-Directed , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Folding , Protein Interaction Domains and Motifs , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Spectroscopy, Fourier Transform Infrared , Superoxide Dismutase-1/genetics , Superoxide Dismutase-1/metabolism , Zinc/metabolism
7.
BMC Bioinformatics ; 20(1): 736, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881961

ABSTRACT

BACKGROUND: With the global spread of multidrug resistance in pathogenic microbes, infectious diseases have emerged as a key public health concern in recent times. Identification of host genes associated with infectious diseases will improve our understanding of the mechanisms behind their development and help to identify novel therapeutic targets. RESULTS: We developed a machine learning-based classification approach to identify infectious disease-associated host genes by integrating sequence and protein interaction network features. Among the different methods, a Deep Neural Network (DNN) model with 16 selected features for pseudo-amino acid composition (PAAC) and network properties achieved the highest accuracy of 86.33% with sensitivity of 85.61% and specificity of 86.57%. The DNN classifier also attained an accuracy of 83.33% on a blind dataset and a sensitivity of 83.1% on an independent dataset. Furthermore, to predict unknown infectious disease-associated host genes, we applied the proposed DNN model to all reviewed proteins from the database. Seventy-six out of 100 highly-predicted infectious disease-associated genes from our study were also found in experimentally-verified human-pathogen protein-protein interactions (PPIs). Finally, we validated the highly-predicted infectious disease-associated genes by disease and gene ontology enrichment analysis and found that many of them are shared by one or more other diseases, such as cancer, metabolic and immune-related diseases. CONCLUSIONS: To the best of our knowledge, this is the first computational method to identify infectious disease-associated host genes. The proposed method will help large-scale prediction of host genes associated with infectious diseases. However, our results indicated that for small datasets, the advanced DNN-based method does not offer a significant advantage over simpler supervised machine learning techniques, such as Support Vector Machine (SVM) or Random Forest (RF), for the prediction of infectious disease-associated host genes. The significant overlap of infectious disease with cancer and metabolic disease in disease and gene ontology enrichment analysis suggests that these diseases perturb the functions of the same cellular signaling pathways and may be treated by drugs that tend to reverse these perturbations. Moreover, identification of novel candidate genes associated with infectious diseases would help us to explain disease pathogenesis further and develop novel therapeutics.


Subjects
Communicable Diseases/genetics , Machine Learning , Amino Acids/analysis , Gene Ontology , Humans , Neural Networks, Computer , Protein Interaction Maps
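Note on entry 7: the abstract reports accuracy, sensitivity and specificity for a binary classifier. These follow the standard confusion-matrix definitions, sketched below. The function name and the counts in the example are made up for illustration and are not taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn count the positive class (predicted correctly / missed);
    tn/fp count the negative class (predicted correctly / falsely flagged).
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,     # fraction of all predictions that are correct
        "sensitivity": tp / (tp + fn),     # true-positive rate (recall)
        "specificity": tn / (tn + fp),     # true-negative rate
    }


# Illustrative counts only.
example_metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```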
8.
BMC Bioinformatics ; 19(Suppl 13): 549, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30717651

ABSTRACT

BACKGROUND: Malignant diseases have become a threat to the health care system. A panoply of biological processes is involved in causing these diseases. In order to unveil the mechanistic details of these diseased states, we analyzed protein families relevant to these diseases. RESULTS: Our present study pivots around four apparently unrelated cancer types, of which two are commonly occurring, viz. prostate cancer and breast cancer, and two are relatively less frequent, viz. acute lymphoblastic leukemia and lymphoma. Eight protein families were found to have implications for these cancer types. Our results strikingly reveal that some of the proteins with implications in the cancerous cellular states showed a structural organization disparate from the signature of the family they belong to. The sequences were further mapped onto their respective structures and compared with the entropic profile. The structures reveal that entropic scores were able to reveal the inherent structural bias of these proteins with quantitative precision, otherwise unseen in other analyses. Subsequently, the betweenness centrality score of each residue from the structure network models was used to explore the changes in dependencies on residues owing to structural disorder. CONCLUSION: These observations help to identify the mechanistic changes resulting from the structural orchestration of protein structures. Finally, the hydropathy indexes were obtained to validate the sequence space observations made using Shannon entropy, in turn establishing their compatibility.


Subjects
Entropy , Evolution, Molecular , Intrinsically Disordered Proteins/chemistry , Neoplasms/metabolism , Animals , Humans , Hydrophobic and Hydrophilic Interactions
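Note on entry 8: the abstract scores sequence positions with Shannon entropy. A common way to do this is to compute the entropy of the residue frequencies in one alignment column, as in the sketch below. Whether the study used exactly this per-column form, this logarithm base, or any gap handling is not stated in the abstract, so treat those choices as assumptions.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column.

    `column` is any non-empty sequence of residue symbols, e.g. "AAGA".
    A fully conserved column scores 0; more variable columns score higher.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative columns only.
conserved_score = shannon_entropy("AAAA")
variable_score = shannon_entropy("ACGT")
```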
9.
BMC Genet ; 19(1): 9, 2018 01 22.
Article in English | MEDLINE | ID: mdl-29357837

ABSTRACT

BACKGROUND: Study of epigenetics is currently a high-impact research topic. Multi-stage methylation is also an area of high-dimensional prospect. In this article, we provide a new study (intra- and inter-species) on brain tissue between human and rhesus, using data profiles based on two methylated cytosine variants (viz., 5-hydroxymethylcytosine (5hmC) and 5-methylcytosine (5mC) samples), through TF-miRNA-gene network based module detection. RESULTS: First of all, we determine differentially 5hmC methylated genes for human as well as rhesus for intra-species analysis, and differentially multi-stage methylated genes for inter-species analysis. Thereafter, we utilize the weighted topological overlap matrix (TOM) measure and average linkage clustering consecutively on these gene sets for the intra- and inter-species study. We identify co-methylated and multi-stage co-methylated gene modules by using dynamic tree cut, for the intra- and inter-species cases, respectively. Each module is represented by an individual color in the dendrogram. Gene Ontology and KEGG pathway based analyses are then performed to identify the biological functionalities of the identified modules. Finally, the top ten regulator TFs and targeter miRNAs that are associated with the maximum number of gene modules are determined for both intra- and inter-species analysis. CONCLUSIONS: The novel TFs and miRNAs obtained from the analysis are: MYST3 and ZNF771 as TFs (for human intra-species analysis), BAZ2B, RCOR3 and ATF1 as TFs (for rhesus intra-species analysis), and mml-miR-768-3p and mml-miR-561 as miRs (for rhesus intra-species analysis); and MYST3 and ZNF771 as miRs (for inter-species study). Furthermore, the genes/TFs/miRNAs that are already known to be responsible for several severe brain-related diseases as well as rare neglected diseases (e.g., Wolf-Hirschhorn syndrome, Joubert syndrome, Huntington's disease, Simian Immunodeficiency Virus (SIV)-mediated encephalitis, Parkinson's disease, bipolar disorder and schizophrenia, etc.) are mentioned.


Subjects
5-Methylcytosine/analogs & derivatives , 5-Methylcytosine/analysis , Brain/metabolism , DNA Methylation , Gene Regulatory Networks , Macaca mulatta/genetics , Animals , Humans , Species Specificity
10.
Brief Bioinform ; 16(5): 830-51, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25479794

ABSTRACT

The computational or in silico approaches for analysing the HIV-1-human protein-protein interaction (PPI) network, predicting different host cellular factors and PPIs and discovering several pathways are gaining popularity in the field of HIV research. Although there exist quite a few studies in this regard, no previous effort has been made to review these works in a comprehensive manner. Here we review the computational approaches that are devoted to the analysis and prediction of HIV-1-human PPIs. We have broadly categorized these studies into two fields: computational analysis of HIV-1-human PPI network and prediction of novel PPIs. We have also presented a comparative assessment of these studies and proposed some methodologies for discussing the implication of their results. We have also reviewed different computational techniques for predicting HIV-1-human PPIs and provided a comparative study of their applicability. We believe that our effort will provide helpful insights to the HIV research community.


Subjects
HIV-1/metabolism , Proteins/metabolism , Computer Simulation , Humans , Protein Binding
11.
J Chem Inf Model ; 55(7): 1469-82, 2015 Jul 27.
Article in English | MEDLINE | ID: mdl-26079845

ABSTRACT

The Cyclin-Dependent Kinases (CDKs) are the core components coordinating the eukaryotic cell division cycle. Generally, the crystal structure of CDKs provides information on possible molecular mechanisms of ligand binding. However, reliable and robust estimation of ligand binding activity has been a challenging task in drug design. In this regard, various machine learning techniques, such as Support Vector Machine, Naive Bayesian classifier, Decision Tree, and K-Nearest Neighbor classifier, have been used. The performance of these heterogeneous classification techniques depends on proper selection of features from the data set. This fact motivated us to propose an integrated classification technique using Genetic Algorithm (GA), Rotational Feature Selection (RFS) scheme, and Ensemble of Machine Learning methods, named the Genetic Algorithm integrated Rotational Ensemble based classification technique, for the prediction of ligand binding activity of CDKs. This technique can automatically find the important features and the ensemble size. For this purpose, GA encodes the features and ensemble size in a chromosome as a binary string. Such encoded features are then used to create diverse sets of training points using RFS in order to train the machine learning method multiple times. The RFS scheme works on Principal Component Analysis (PCA) to preserve the variability information of the rotational nonoverlapping subsets of the original data. Thereafter, the testing points are fed to the different instances of the trained machine learning method in order to produce the ensemble result. Here, accuracy is computed as the final result after 10-fold cross validation, which is also used as the objective function for GA to maximize. The effectiveness of the proposed classification technique has been demonstrated quantitatively and visually in comparison with different machine learning methods for 16 ligand binding CDK docking and rescoring data sets. In addition, the best possible features have been reported for CDK docking and rescoring data sets separately. Finally, the Friedman test has been conducted to judge the statistical significance of the results produced by the proposed technique. The results indicate that the integrated classification technique has high relevance in predicting protein-ligand binding activity.


Subjects
Cyclin-Dependent Kinases/antagonists & inhibitors , Cyclin-Dependent Kinases/metabolism , Machine Learning , Protein Kinase Inhibitors/metabolism , Protein Kinase Inhibitors/pharmacology , Algorithms , Bayes Theorem , Chromosomes/genetics , Cyclin-Dependent Kinases/chemistry , Decision Trees , Models, Molecular , Protein Binding , Protein Conformation , Support Vector Machine
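Note on entry 11: the abstract says the genetic algorithm encodes the selected features and the ensemble size in one binary-string chromosome. The abstract does not give the bit layout, so the sketch below assumes one plausible layout: the first `n_features` bits form a feature mask and the remaining bits encode the ensemble size as a binary integer (plus one, so the ensemble is never empty). The function name and layout are hypothetical.

```python
def decode_chromosome(bits, n_features):
    """Decode a binary-string chromosome into (selected feature indices, ensemble size).

    Assumed layout (not from the paper): bits[:n_features] is a 0/1 feature mask,
    bits[n_features:] is the ensemble size minus one, written in binary.
    """
    if len(bits) <= n_features or set(bits) - {"0", "1"}:
        raise ValueError("chromosome must be a binary string longer than n_features")
    # Indices of the features switched on in the mask.
    selected = [i for i, bit in enumerate(bits[:n_features]) if bit == "1"]
    # Remaining bits give the number of ensemble members (at least one).
    ensemble_size = int(bits[n_features:], 2) + 1
    return selected, ensemble_size


# Illustrative chromosome: 5 feature bits followed by 3 size bits.
example_features, example_size = decode_chromosome("10110" + "011", n_features=5)
```

A fitness function in this setting would train the ensemble on the decoded feature subset and return the cross-validated accuracy, which the abstract names as the objective to maximize.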
12.
J Biomed Inform ; 57: 308-19, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26297985

ABSTRACT

Gene ranking is an important problem in bioinformatics. Here, we propose a new framework for ranking biomolecules (viz., miRNAs, transcription-factors/TFs and genes) in a multi-informative uterine leiomyoma dataset having both gene expression and methylation data using (statistical) eigenvector centrality based approach. At first, genes that are both differentially expressed and methylated, are identified using Limma statistical test. A network, comprising these genes, corresponding TFs from TRANSFAC and ITFP databases, and targeter miRNAs from miRWalk database, is then built. The biomolecules are then ranked based on eigenvector centrality. Our proposed method provides better average accuracy in hub gene and non-hub gene classifications than other methods. Furthermore, pre-ranked Gene set enrichment analysis is applied on the pathway database as well as GO-term databases of Molecular Signatures Database with providing a pre-ranked gene-list based on different centrality values for comparing among the ranking methods. Finally, top novel potential gene-markers for the uterine leiomyoma are provided.


Subjects
Databases, Genetic , Gene Regulatory Networks , Leiomyoma/genetics , MicroRNAs , Uterine Neoplasms/genetics , Animals , Female , Gene Expression Profiling , Genes , Transcription Factors
13.
BMC Bioinformatics ; 15: 26, 2014 Jan 24.
Article in English | MEDLINE | ID: mdl-24460683

ABSTRACT

BACKGROUND: Discovering novel interactions between HIV-1 and human proteins would greatly contribute to different areas of HIV research. Identification of such interactions leads to a greater insight into drug target prediction. Some recent studies have been conducted for computational prediction of new interactions based on the experimentally validated information stored in an HIV-1-human protein-protein interaction database. However, these techniques do not predict any regulatory mechanism between HIV-1 and human proteins by considering interaction types and direction of regulation of interactions. RESULTS: Here we present an association rule mining technique based on biclustering for discovering a set of rules among human and HIV-1 proteins using the publicly available HIV-1-human PPI database. These rules are subsequently utilized to predict some novel interactions among HIV-1 and human proteins. For prediction purposes, both the interaction types and the direction of regulation of interactions (i.e., virus-to-host or host-to-virus) are considered here to provide important additional information about the regulation pattern of interactions. We have also studied the biclusters and analyzed the significant GO terms and KEGG pathways in which the human proteins of the biclusters participate. Moreover, the predicted rules have also been analyzed to discover regulatory relationships between some human proteins in the course of HIV-1 infection. Some experimental evidence for our predicted interactions has been found by searching the recent literature in PubMed. We have also highlighted some human proteins that are likely to act against the HIV-1 attack. CONCLUSIONS: We pose the problem of identifying new regulatory interactions between HIV-1 and human proteins based on the existing PPI database as an association rule mining problem based on a biclustering algorithm. We discover some novel regulatory interactions between HIV-1 and human proteins. A significant number of predicted interactions have been found to be supported by recent literature.


Subjects
Cluster Analysis , Computational Biology/methods , HIV Infections , HIV-1 , Host-Pathogen Interactions/physiology , Databases, Factual , HIV Infections/metabolism , HIV Infections/virology , HIV-1/physiology , Humans , Protein Interaction Mapping , Proteins/metabolism
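Note on entry 13: the abstract frames interaction prediction as association rule mining. Rules of the form "antecedent implies consequent" are usually scored by support and confidence, which the sketch below computes over a list of item sets. The item labels and transactions are toy placeholders, not data from the HIV-1-human PPI database, and the biclustering step used in the study is not reproduced here.

```python
def rule_support_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    support    = fraction of transactions containing antecedent and consequent
    confidence = fraction of antecedent-containing transactions that also
                 contain the consequent (0.0 if the antecedent never occurs)
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy transactions: each set lists hypothetical interaction labels for one human protein.
toy_transactions = [
    {"viral_A:activates", "viral_B:binds"},
    {"viral_A:activates", "viral_B:binds", "viral_C:inhibits"},
    {"viral_A:activates"},
    {"viral_B:binds", "viral_C:inhibits"},
]
toy_support, toy_confidence = rule_support_confidence(
    toy_transactions, {"viral_A:activates"}, {"viral_B:binds"}
)
```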
14.
Nucleic Acids Res ; 40(Database issue): D615-20, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22102573

ABSTRACT

Pathogenic bacteria produce protein toxins to survive in the hostile environments defined by the host's defense systems and immune response. Recent progresses in high-throughput genome sequencing and structure determination techniques have contributed to a better understanding of mechanisms of action of the bacterial toxins at the cellular and molecular levels leading to pathogenicity. It is fair to assume that with time more and more unknown toxins will emerge not only by the discovery of newer species but also due to the genetic rearrangement of existing bacterial genomes. Hence, it is crucial to organize a systematic compilation and subsequent analyses of the inherent features of known bacterial toxins. We developed a Database for Bacterial ExoToxins (DBETH, http://www.hpppi.iicb.res.in/btox/), which contains sequence, structure, interaction network and analytical results for 229 toxins categorized within 24 mechanistic and activity types from 26 bacterial genuses. The main objective of this database is to provide a comprehensive knowledgebase for human pathogenic bacterial toxins where various important sequence, structure and physico-chemical property based analyses are provided. Further, we have developed a prediction server attached to this database which aims to identify bacterial toxin like sequences either by establishing homology with known toxin sequences/domains or by classifying bacterial toxin specific features using a support vector based machine learning techniques.


Subjects
Bacterial Proteins/chemistry , Bacterial Toxins/chemistry , Databases, Protein , Exotoxins/chemistry
15.
Int J Neural Syst ; : 2450063, 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39212940

ABSTRACT

In many modern machine learning (ML) models, attention mechanisms (AMs) play a crucial role in processing data and identifying significant parts of the inputs, whether these are text or images. This selective focus enables subsequent stages of the model to achieve improved classification performance. Traditionally, AMs are applied as a preprocessing substructure before a neural network, such as in encoder/decoder architectures. In this paper, we extend the application of AMs to intermediate stages of data propagation within ML models. Specifically, we propose a generalized attention mechanism (GAM), which can be integrated before each layer of a neural network for classification tasks. The proposed GAM allows for at each layer/step of the ML architecture identification of the most relevant sections of the intermediate results. Our experimental results demonstrate that incorporating the proposed GAM into various ML models consistently enhances the accuracy of these models. This improvement is achieved with only a marginal increase in the number of parameters, which does not significantly affect the training time.

16.
Comput Biol Med ; 163: 107182, 2023 09.
Article in English | MEDLINE | ID: mdl-37379615

ABSTRACT

Over the last couple of decades, the introduction and proliferation of whole-slide scanners led to increasing interest in the research of digital pathology. Although manual analysis of histopathological images is still the gold standard, the process is often tedious and time consuming. Furthermore, manual analysis also suffers from intra- and interobserver variability. Separating structures or grading morphological changes can be difficult due to architectural variability of these images. Deep learning techniques have shown great potential in histopathology image segmentation that drastically reduces the time needed for downstream tasks of analysis and providing accurate diagnosis. However, few algorithms have clinical implementations. In this paper, we propose a new deep learning model Dense Dilated Multiscale Supervised Attention-Guided (D2MSA) Network for histopathology image segmentation that makes use of deep supervision coupled with a hierarchical system of novel attention mechanisms. The proposed model surpasses state-of-the-art performance while using similar computational resources. The performance of the model has been evaluated for the tasks of gland segmentation and nuclei instance segmentation, both of which are clinically relevant tasks to assess the state and progress of malignancy. Here, we have used histopathology image datasets for three different types of cancer. We have also performed extensive ablation tests and hyperparameter tuning to ensure the validity and reproducibility of the model performance. The proposed model is available at www.github.com/shirshabose/D2MSA-Net.


Subjects
Image Processing, Computer-Assisted , Neoplasms , Humans , Reproducibility of Results , Image Processing, Computer-Assisted/methods , Algorithms , Neoplasms/diagnostic imaging , Observer Variation
17.
Sci Rep ; 13(1): 22555, 2023 12 18.
Article in English | MEDLINE | ID: mdl-38110462

ABSTRACT

Breast cancer is one of the most common cancers in women and the second leading cause of cancer death in women after lung cancer. Recent technological advances in breast cancer treatment offer hope to millions of women worldwide. Segmentation of breast Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) is one of the necessary tasks in the diagnosis and detection of breast cancer. Currently, U-Net, a popular deep learning model, is extensively used in biomedical image segmentation. This article aims to advance the state of the art and conduct a more in-depth analysis focused on the use of various U-Net models for lesion detection in women's breast DCE-MRI. We perform an empirical study of the effectiveness and efficiency of U-Net and its derived deep learning models, including ResUNet, Dense UNet, DUNet, Attention U-Net, UNet++, MultiResUNet, RAUNet, Inception U-Net and U-Net GAN, for lesion detection in breast DCE-MRI. All the models are applied to a benchmark set of 100 sagittal T2-weighted fat-suppressed DCE-MRI slices from 20 patients, and their performance is compared. A comparative study has also been conducted with V-Net, W-Net, and DeepLabV3+. The Wilcoxon signed-rank test, a non-parametric statistical test, is used to analyze the significance of the quantitative results. Furthermore, Multi-Criteria Decision Analysis (MCDA) is used to evaluate overall performance in terms of accuracy, precision, sensitivity, F1-score, specificity, geometric mean, DSC, and false-positive rate. The RAUNet segmentation model achieved a high accuracy of 99.76%, sensitivity of 85.04%, precision of 90.21%, and Dice Similarity Coefficient (DSC) of 85.04%, whereas ResUNet achieved 99.62% accuracy, 62.26% sensitivity, 99.56% precision, and 72.86% DSC. ResUNet is found to be the most effective model based on MCDA. On the other hand, U-Net GAN takes the least computational time to perform the segmentation task. Both quantitative and qualitative results demonstrate that the ResUNet model performs better than the other models in image segmentation and lesion detection, though the computational time needed to achieve these objectives varies.
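As a reading aid for the metrics quoted above, the sketch below computes them from a predicted and a ground-truth binary mask using the standard confusion-matrix definitions. The function name `segmentation_metrics` and the toy 4x4 masks are assumptions made for this example; they are not data or code from the study.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for a binary segmentation mask (standard definitions)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))     # lesion pixels correctly detected
    tn = int(np.sum(~pred & ~truth))   # background pixels correctly rejected
    fp = int(np.sum(pred & ~truth))    # background predicted as lesion
    fn = int(np.sum(~pred & truth))    # lesion pixels missed

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Hypothetical toy masks: 4 true lesion pixels, 3 of them detected, 1 false alarm.
truth = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 1, 0]])

metrics = segmentation_metrics(pred, truth)
```

Because accuracy counts true negatives, it can remain high when the lesion occupies a small share of the image even if sensitivity is modest, which is why overlap measures such as DSC are usually reported alongside it.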


Subjects
Breast Neoplasms , Deep Learning , Humans , Female , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Breast/diagnostic imaging , Breast/pathology , Breast Neoplasms/pathology
18.
Front Genet ; 14: 1095330, 2023.
Article in English | MEDLINE | ID: mdl-36865387

ABSTRACT

Handling biomedical big data is currently a challenging task. In particular, integrating multi-modal data and then mining significant features (gene signature detection) is daunting. With this in mind, we propose a novel framework, three-factor penalized, non-negative matrix factorization-based multiple kernel learning with soft margin hinge loss (3PNMF-MKL), for multi-modal data integration followed by gene signature detection. In brief, limma, which employs empirical Bayes statistics, was first applied to each individual molecular profile to extract the statistically significant features. The three-factor penalized non-negative matrix factorization method was then used for data/matrix fusion on the reduced feature sets. Multiple kernel learning models with soft margin hinge loss were deployed to estimate average accuracy scores and the area under the curve (AUC). Gene modules were identified by consecutive analysis with average linkage clustering and dynamic tree cut. The module with the highest correlation was considered the potential gene signature. We used an acute myeloid leukemia dataset from The Cancer Genome Atlas (TCGA) repository containing five molecular profiles. Our algorithm generated a 50-gene signature that achieved a high classification AUC score (0.827). We explored the functions of the signature genes using pathway and Gene Ontology (GO) databases. Our method outperformed the state-of-the-art methods in terms of AUC. Furthermore, we included comparative studies with other related methods to support the acceptability of our method. Finally, we note that our algorithm can be applied to any multi-modal dataset for data integration followed by gene module discovery.

19.
Amino Acids ; 43(2): 583-94, 2012 Aug.
Article in English | MEDLINE | ID: mdl-21993537

ABSTRACT

In this article, we categorize the presently available experimental and theoretical knowledge of various physicochemical and biochemical features of amino acids, as collected in the AAindex database of 544 known amino acid (AA) indices. The 402 previously reported indices were categorized into six groups using a hierarchical clustering technique, and 142 were left unclustered. However, because of the increasing diversity of the database, these indices overlap, so a crisp clustering method may not provide optimal results. Moreover, in various large-scale bioinformatics analyses of whole proteomes, the proper selection of amino acid indices representing their biological significance is crucial for efficient and error-free encoding of short functional sequence motifs. In most cases, researchers perform exhaustive manual selection of the most informative indices. These two facts motivated us to analyse the widely used AA indices. The main goal of this article is twofold. First, we present a novel method of partitioning bioinformatics data using consensus fuzzy clustering, in which recently proposed fuzzy clustering techniques are exploited. Second, we prepare three high-quality subsets of all available indices. The superiority of the consensus fuzzy clustering method is demonstrated quantitatively, visually and statistically by comparing it with the previously proposed hierarchical clustering results. The processed AAindex1 database, supplementary material and the software are available at http://sysbio.icm.edu.pl/aaindex/.


Subjects
Algorithms , Amino Acids/chemistry , Data Interpretation, Statistical , Models, Molecular , Amino Acids/classification , Cluster Analysis , Hydrophobic and Hydrophilic Interactions , Isoelectric Point , Molecular Weight
20.
Trans Indian Natl Acad Eng ; 7(3): 927-941, 2022.
Article in English | MEDLINE | ID: mdl-35836615

ABSTRACT

Intelligent transport systems need to be revised in many respects for post-pandemic situations such as the one following COVID-19. The passenger count inside a car will be restricted based on the vehicle capacity and the COVID-19 hot-spot zone. Traffic rules will be adapted to cope with similar contagious outbreaks. The on-road 'Yellow-Vulture' cameras need to incorporate such surveillance rules to monitor related anomalies and prevent contamination. To maintain safe distancing, governments are likely to prefer an automatic surveillance system in the near future. Moreover, wearing a face mask during the journey has become an essential habit for stopping the spread of infection. In this article, we propose a deep-learning-based framework that employs an augmented image data set to provide proper surveillance in the transport system and maintain the health protocols. Fast and accurate detection of the number of passengers inside a car and of their face masks from the traffic inspection camera feed is demonstrated. We exploit the advantages of the popular transfer learning approach, with novel variations of images during training. To the best of our knowledge, this is the first attempt to monitor in-vehicle social distancing in post-pandemic circumstances through deep-learning-based image analysis. The superiority of the proposed framework over several state-of-the-art techniques is established using different numerical metrics and visual comparisons, supported by a statistical hypothesis test. Our technique achieved 98.5% testing accuracy under various adverse conditions. Zero-shot evaluation was explored on the Real-Time-Medical-Mask-Detection data set Wang et al. (Real-Time-Medical-Mask-Detection, 2020a https://github.com/TheSSJ2612/Real-Time-Medical-Mask-Detection/, Accessed 14 Nov 2020), where we attained 96.4% accuracy, which indicates that the network generalizes.
