1.
Proteins ; 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39239684

ABSTRACT

Phosphorylation is a major posttranslational modification of proteins in which a phosphate group is added to an amino acid side chain after translation in the ribosome. It is vital for coordinating cellular functions such as metabolism, proliferation, apoptosis, subcellular trafficking, and other crucial physiological processes. Phosphorylation prediction in microbial organisms can assist in understanding pathogenesis and host-pathogen interaction, drug and antibody design, and antimicrobial agent development. Experimental methods for identifying phosphorylation sites are costly, slow, and tedious. Hence, low-cost and high-speed computational approaches are highly desirable. This paper presents a new deep learning tool called DeepPhoPred for predicting microbial phospho-serine (pS), phospho-threonine (pT), and phospho-tyrosine (pY) sites. DeepPhoPred incorporates a two-headed convolutional neural network architecture with squeeze-and-excitation blocks followed by fully connected layers that jointly learn significant features from the peptide's structural and evolutionary information to predict phosphorylation sites. Our empirical results demonstrate that DeepPhoPred significantly outperforms the existing microbial phosphorylation site predictors with its highly efficient deep-learning architecture. DeepPhoPred as a standalone predictor, all its source code, and the datasets we used are publicly available at https://github.com/faisalahm3d/DeepPhoPred.

2.
Bioinformatics ; 38(15): 3717-3724, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35731219

ABSTRACT

MOTIVATION: Advances in sequencing technologies have led to the sequencing of the genomes of a multitude of organisms. However, draft genomes of many of these organisms contain a large number of gaps due to repeats in genomes, low sequencing coverage and limitations in sequencing technologies. Although several tools exist for filling gaps, many of them do not utilize all information relevant to gap filling. RESULTS: Here, we present a probabilistic method for filling gaps in draft genome assemblies using second-generation reads, based on a generative model for sequencing that takes into account information on insert sizes and sequencing errors. Unlike the graph-based methods adopted in the literature, our method is based on the expectation-maximization algorithm. Experiments on real biological datasets show that this novel approach can fill large portions of gaps with a small number of errors and misassemblies compared to other state-of-the-art gap-filling tools. AVAILABILITY AND IMPLEMENTATION: The method is implemented in C++ in a software tool named 'Filling Gaps by Iterative Read Distribution (Figbird)', which is available at https://github.com/SumitTarafder/Figbird. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing; Software; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; Algorithms; Genome
3.
Genomics ; 114(2): 110264, 2022 03.
Article in English | MEDLINE | ID: mdl-34998929

ABSTRACT

Cancer is one of the major causes of human death each year. In recent years, cancer identification and classification using machine learning have gained momentum due to the availability of high-throughput sequencing data. Research using RNA-seq is growing rapidly, and new insights into cancer and related treatments are coming to light. In this paper, we propose PanClassif, a method that requires only a few effective genes to detect cancer from RNA-seq data and provides a performance gain across a wide range of machine learning classifiers. We took samples of 22 cancer types from The Cancer Genome Atlas (TCGA), comprising 8287 cancer samples and 680 normal samples. First, PanClassif uses k-nearest neighbour (k-NN) smoothing to handle noise in the data. Effective genes are then selected by an ANOVA-based test. To balance the training data, PanClassif applies an oversampling method, SMOTE. We performed comprehensive experiments on the datasets using several classification algorithms. Experimental results show that PanClassif outperforms existing state-of-the-art methods and shows consistent performance on two single-cell RNA-seq datasets taken from Gene Expression Omnibus (GEO). PanClassif improves the performance of a wide variety of classifiers for both binary cancer prediction and multi-class cancer classification. PanClassif is available as a Python package (https://pypi.org/project/panclassif/). All the source code and materials of PanClassif are available at https://github.com/Zwei-inc/panclassif.
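
The gene-selection step described above can be illustrated with a one-way ANOVA F-statistic computed per gene. The following is a minimal sketch only, not the PanClassif implementation: the function names `anova_f` and `select_genes`, the data layout (rows are samples, columns are genes) and the "keep the top-scoring genes" rule are assumptions made for illustration.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic for a single gene.

    `groups` is a list of lists; each inner list holds the expression values
    of the gene for the samples of one class. Assumes at least two classes
    and more samples than classes.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def select_genes(expr, labels, top):
    """Hypothetical helper: indices of the `top` genes with the largest F."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(expr[0])):
        groups = [[row[j] for row, y in zip(expr, labels) if y == c]
                  for c in classes]
        scores.append((anova_f(groups), j))
    scores.sort(key=lambda t: (-t[0], t[1]))  # highest F first
    return [j for _, j in scores[:top]]
```

A gene whose mean expression differs strongly between classes relative to its within-class spread receives a large F and is retained; a gene with identical class means scores 0.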


Subjects
Machine Learning; Neoplasms; Algorithms; Gene Expression; Gene Expression Profiling; Humans; Neoplasms/diagnosis; Neoplasms/genetics; RNA-Seq; Sequence Analysis, RNA/methods; Software
4.
Bioinformatics ; 36(19): 4869-4875, 2020 12 08.
Article in English | MEDLINE | ID: mdl-32614400

ABSTRACT

MOTIVATION: A promoter is a short region of DNA that is responsible for initiating transcription of specific genes. Computational tools for automatic identification of promoters are in high demand. Promoters can be of different types according to their function. Promoters may show both intra- and interclass variation and similarity in terms of consensus sequences. Accurate classification of the various types of sigma promoters remains a challenge. RESULTS: We present iPromoter-BnCNN for identification and accurate classification of six types of promoters: σ24, σ28, σ32, σ38, σ54 and σ70. It is a CNN-based classifier that combines local features related to monomer nucleotide sequence, trimer nucleotide sequence, dimer structural properties and trimer structural properties through the use of parallel branching. We conducted experiments on a benchmark dataset and compared our tool with six state-of-the-art tools to show its superior performance under 5-fold cross-validation. Moreover, we tested our classifier on an independent test dataset. AVAILABILITY AND IMPLEMENTATION: Our proposed tool, the iPromoter-BnCNN web server, is freely available at http://103.109.52.8/iPromoter-BnCNN. The runnable source code can be found at https://colab.research.google.com/drive/1yWWh7BXhsm8U4PODgPqlQRy23QGjF2DZ. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Sigma Factor; Software; DNA; Promoter Regions, Genetic; Sequence Analysis, DNA
5.
Genomics ; 112(3): 2583-2589, 2020 05.
Article in English | MEDLINE | ID: mdl-32068122

ABSTRACT

Knowledge of the sub-cellular localization of long non-coding RNAs (lncRNAs), the most diverse class of transcribed RNA, will help identify different types of cancers and other diseases, as lncRNAs play a key role in related cellular functions. With the recent exponential growth of known records, it has become essential to establish new machine-learning-based techniques to identify new ones, since they provide faster and cheaper solutions than laboratory methods. In this paper, we propose Locate-R, a novel method for predicting the sub-cellular location of lncRNAs. We use only n-gapped l-mer composition and l-mer composition as features and select the best 655 features to build the model. The model is based on locally deep support vector machines, which significantly enhance the prediction accuracy with respect to existing state-of-the-art methods. Our predictor is readily available for use as a stand-alone web application from: http://locate-r.azurewebsites.net/.
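
The two feature families named above can be sketched as follows. The abstract does not define them precisely, so this is one plausible reading and not the Locate-R code: l-mer composition is taken as the normalized frequency of every contiguous l-mer, and the gapped variant is simplified to pairs of nucleotides separated by exactly n skipped positions. The function names and the RNA alphabet `ACGU` are assumptions.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet

def lmer_composition(seq, l):
    """Normalized frequency of every contiguous l-mer in `seq`."""
    counts = Counter(seq[i:i + l] for i in range(len(seq) - l + 1))
    total = max(sum(counts.values()), 1)
    return {"".join(p): counts["".join(p)] / total
            for p in product(ALPHABET, repeat=l)}

def gapped_dimer_composition(seq, n):
    """Normalized frequency of letter pairs with exactly `n` positions
    skipped between them (a simplified stand-in for n-gapped l-mers)."""
    counts = Counter(seq[i] + seq[i + n + 1]
                     for i in range(len(seq) - n - 1))
    total = max(sum(counts.values()), 1)
    return {a + b: counts[a + b] / total
            for a, b in product(ALPHABET, repeat=2)}
```

Concatenating the values of both dictionaries for several choices of l and n gives a fixed-length feature vector per sequence, from which a subset could then be selected.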


Subjects
RNA, Long Noncoding/analysis; Software; Cell Nucleus/genetics; Cytoplasm/genetics; Exosomes/genetics; Internet; Nucleotides/chemistry; RNA, Long Noncoding/chemistry; Ribosomes/genetics; Sequence Analysis, RNA; Support Vector Machine
6.
Bioinformatics ; 35(19): 3831-3833, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30850831

ABSTRACT

MOTIVATION: Extracting a useful feature set that contains significant discriminatory information is a critical step in effectively presenting sequence data to predict the structure, function, interaction and expression of proteins, DNAs and RNAs. Also, being able to filter features with significant information and avoid sparsity in the extracted features requires efficient feature selection techniques. Here we present PyFeat as a practical and easy-to-use toolkit implemented in Python for extracting various features from proteins, DNAs and RNAs. To build PyFeat we mainly focused on extracting features that capture information about the interaction of neighboring residues, in order to provide more local information. We then employ the AdaBoost technique to select features with maximum discriminatory information. In this way, we can significantly reduce the number of extracted features and enable PyFeat to represent the combination of effective features from large neighboring residues. As a result, PyFeat is able to extract features from 13 different techniques and represent a context-free combination of effective features. The source code for the PyFeat standalone toolkit and the employed benchmarks, with a comprehensive user manual explaining its system and workflow step by step, are publicly available. RESULTS: https://github.com/mrzResearchArena/PyFeat/blob/master/RESULTS.md. AVAILABILITY AND IMPLEMENTATION: Toolkit, source code and manual to use PyFeat: https://github.com/mrzResearchArena/PyFeat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software; Amino Acid Sequence; DNA; Proteins; RNA
7.
Curr Genomics ; 21(3): 194-203, 2020 Apr.
Article in English | MEDLINE | ID: mdl-33071613

ABSTRACT

A variety of protein post-translational modifications has been identified that control many cellular functions. Phosphorylation studies in mycobacterial organisms have shown critical importance in diverse biological processes, such as intercellular communication and cell division. Recent technical advances in high-precision mass spectrometry have determined a large number of microbial phosphorylated proteins and phosphorylation sites throughout the proteome analysis. Identification of phosphorylated proteins with specific modified residues through experimentation is often labor-intensive, costly and time-consuming. All these limitations could be overcome through the application of machine learning (ML) approaches. However, only a limited number of computational phosphorylation site prediction tools have been developed so far. This work aims to present a complete survey of the existing ML-predictors for microbial phosphorylation. We cover a variety of important aspects for developing a successful predictor, including operating ML algorithms, feature selection methods, window size, and software utility. Initially, we review the currently available phosphorylation site databases of the microbiome, the state-of-the-art ML approaches, working principles, and their performances. Lastly, we discuss the limitations and future directions of the computational ML methods for the prediction of phosphorylation.

8.
Genomics ; 111(4): 966-972, 2019 07.
Article in English | MEDLINE | ID: mdl-29935224

ABSTRACT

Recombination hotspots in a genome are unevenly distributed. Hotspots are regions in a genome that show higher rates of meiotic recombination. Computational methods for recombination hotspot prediction often use sophisticated features derived from physico-chemical or structure-based properties of nucleotides. In this paper, we propose iRSpot-SF, which uses sequence-based features that are computationally cheap to generate. Four feature groups are used in our method: k-mer composition, gapped k-mer composition, TF-IDF of k-mers and reverse complement k-mer composition. We used recursive feature elimination to select the 17 top features for hotspot prediction. Our analysis shows the superiority of the gapped k-mer composition and reverse complement k-mer composition features over the others. We used an SVM with an RBF kernel as the classification algorithm and tested our algorithm on standard benchmark datasets. Compared to other methods, iRSpot-SF produces significantly better results in terms of accuracy, Matthews correlation coefficient and sensitivity, which are 84.58%, 0.6941 and 84.57%, respectively. We have made our method readily available as a Python-based tool and made the datasets and source code available at: https://github.com/abdlmaruf/iRSpot-SF. A web application based on iRSpot-SF is freely available at: http://irspot.pythonanywhere.com/server.html.
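
Of the four feature groups listed above, reverse complement k-mer composition is the least self-explanatory. A minimal sketch follows, assuming the common convention that each k-mer is merged with its reverse complement and counted under whichever of the two is lexicographically smaller. The function names are illustrative and are not taken from the iRSpot-SF repository.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(kmer):
    """Complement each base, then reverse the string."""
    return kmer.translate(COMPLEMENT)[::-1]

def revcomp_kmer_composition(seq, k):
    """Frequency of k-mers, with each k-mer and its reverse complement
    counted under one shared (canonical) key."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        canonical = min(kmer, reverse_complement(kmer))
        counts[canonical] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}
```

Merging strand-equivalent k-mers roughly halves the number of features compared with plain k-mer composition, which keeps such features cheap to compute and store.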


Subjects
Recombination, Genetic; Sequence Analysis, DNA/methods; Software; Animals; Humans; Nucleotide Motifs
9.
Genomics ; 111(5): 1160-1166, 2019 09.
Article in English | MEDLINE | ID: mdl-30059731

ABSTRACT

Sigma promoter sequences in bacterial genomes are important due to their role in transcription initiation. Sigma 70 is one of the most important sigma factors. In this paper, we address the problem of identifying σ70 promoter sequences in bacterial genomes. We propose iPromoter-FSEn, a novel predictor for identification of σ70 promoter sequences. Our proposed method is based on a feature-subspace-based ensemble classifier. A large set of features extracted from the nucleotide sequence is divided into subsets, and each subset is given to an individual classifier to learn. An aggregate decision is then made by the ensemble voting classifier based on the decisions of the individual classifiers. We tested our method on a standard benchmark dataset extracted from experimentally validated results. Experimental results show that iPromoter-FSEn significantly improves over the state-of-the-art σ70 promoter sequence predictors. The accuracy and area under the receiver operating characteristic curve of iPromoter-FSEn are 86.32% and 0.9319, respectively. We have also made our method readily available for use as a web application from: http://ipromoterfsen.pythonanywhere.com/server.


Subjects
DNA, Bacterial/genetics; Promoter Regions, Genetic; Sequence Analysis, DNA/methods; Software; Bacteria/genetics; DNA, Bacterial/chemistry; Protein Binding; Sigma Factor/metabolism
10.
Mol Genet Genomics ; 294(1): 69-84, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30187132

ABSTRACT

In bacterial DNA, there are specific sequences of nucleotides called promoters that can bind to the RNA polymerase. Sigma70 (σ70) is one of the most important promoter sequences due to its presence in most of the DNA regulatory functions. In this paper, we identify the most effective and optimal sequence-based features for prediction of σ70 promoter sequences in a bacterial genome. We used both short-range and long-range DNA sequences in our proposed method. A very small number of effective features are selected from a large number of features extracted using multiple windows of different sizes within the DNA sequences. We call our prediction method iPro70-FMWin and have made it freely accessible online via a web application at http://ipro70.pythonanywhere.com/server for the convenience of researchers. We tested our method using a standard benchmark dataset. In the experiments, iPro70-FMWin achieved an area under the receiver operating characteristic curve of 0.959 and an accuracy of 90.57%, significantly outperforming the state-of-the-art predictors.


Subjects
Bacteria/enzymology; DNA-Directed RNA Polymerases/genetics; Genomics/methods; Sigma Factor/genetics; Algorithms; Bacteria/genetics; Genome, Bacterial; Internet; Promoter Regions, Genetic; ROC Curve; Software; Support Vector Machine
11.
Anal Biochem ; 569: 16-21, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30664849

ABSTRACT

RNA editing processes such as adenosine-to-inosine (A-to-I) editing often influence basic functions such as splicing stability and, most importantly, translation. Knowledge about editing sites is therefore of great importance in molecular biology. With the growth of known editing sites, machine learning or data-centric approaches are now being applied to the problem of predicting RNA editing sites. In this paper, we propose EPAI-NC, a novel method for prediction of RNA editing sites. We used l-mer composition and n-gapped l-mer composition as features and used the Pearson correlation coefficient to select features according to the Pareto principle. Locally deep support vector machines were used to train the classification model of EPAI-NC. EPAI-NC significantly enhances the prediction accuracy compared to the previous state-of-the-art methods when tested on standard benchmark and independent datasets.
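
The feature-selection idea above can be sketched as ranking features by the absolute Pearson correlation with the class label and keeping the top 20%. The 20% cut-off is this sketch's interpretation of "according to the Pareto principle"; the abstract does not state the exact rule, and the names `pearson` and `pareto_select` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def pareto_select(features, labels, keep=0.2):
    """Indices of the top `keep` fraction of feature columns, ranked by
    |r| with the label (assumed reading of the Pareto principle)."""
    n_feat = len(features[0])
    scored = []
    for j in range(n_feat):
        col = [row[j] for row in features]
        scored.append((abs(pearson(col, labels)), j))
    scored.sort(key=lambda t: (-t[0], t[1]))  # strongest correlation first
    n_keep = max(1, round(keep * n_feat))
    return sorted(j for _, j in scored[:n_keep])
```

With labels encoded as 0 and 1, the Pearson coefficient equals the point-biserial correlation, so the ranking reflects how well each single feature separates the two classes linearly.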


Subjects
Adenosine/metabolism; Inosine/metabolism; Machine Learning; RNA Editing; RNA/metabolism; Area Under Curve; ROC Curve; User-Computer Interface
12.
J Theor Biol ; 460: 64-78, 2019 01 07.
Article in English | MEDLINE | ID: mdl-30316822

ABSTRACT

DNA-binding proteins (DBPs) are responsible for several cellular functions, ranging from our immune system to the transport of oxygen. In recent studies, scientists have used supervised machine-learning-based methods that use information from the protein sequence only to classify DBPs. Most of these methods work effectively on the training sets, but the performance of most of them degrades on the independent test set. This leaves room for improving the prediction method by reducing over-fitting. In this paper, we extract several features solely from the protein sequence and carry out two different types of feature selection on them. Our results proved comparable on the training set and significantly improved on the independent test set. On the independent test set our accuracy was 82.26%, an improvement of 1.62% over the previous best state-of-the-art methods. Performance in terms of sensitivity and area under the receiver operating characteristic curve on the independent test set was also higher, at 0.95 and 0.823 respectively.


Subjects
DNA-Binding Proteins/chemistry; Support Vector Machine; Algorithms; Amino Acid Sequence; Computational Biology/methods; DNA-Binding Proteins/classification; ROC Curve; Reproducibility of Results
13.
J Theor Biol ; 464: 1-8, 2019 03 07.
Article in English | MEDLINE | ID: mdl-30578798

ABSTRACT

Experimental identification of drug-target interactions is very labor-intensive and expensive, which has motivated researchers to focus on in silico prediction to provide information on potential interactions. In recent years, researchers have proposed several computational approaches for predicting new drug-target interactions. In this paper, we present CFSBoost, a simple and computationally cheap ensemble boosting classification model for identification and prediction of drug-target interactions using evolutionary and structural features. CFSBoost uses a simple yet novel feature group selection procedure that allows the model to be computationally very cheap while achieving state-of-the-art performance. The ensemble model uses extra trees as weak learners inside a boosting scheme while holding on to the best model per iteration. We tested our method on four benchmark datasets, which are also referred to as gold standard datasets. Our method achieved a better score in terms of area under the receiver operating characteristic (auROC) curve on 2 out of the 4 datasets. It also achieved a higher area under the precision-recall (auPR) curve on 3 out of the 4 datasets. Researchers have argued that the auPR metric is more suitable than auROC for comparing performance on imbalanced datasets such as our benchmark datasets. Our reported results show that, despite its simple design, CFSBoost performs very satisfactorily compared with other methods in the literature. We also provide 5 new possible interactions for each dataset based on CFSBoost's prediction score.
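
The two metrics compared above can be computed from labels and prediction scores with a few lines of code. The sketch below is illustrative and is not CFSBoost's evaluation code: auROC is obtained from the rank-sum identity (the probability that a random positive is scored above a random negative), and auPR is approximated by average precision, which is one common estimator among several.

```python
def au_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and
    negatives; ties count as half a win. Requires both classes present."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def au_pr(labels, scores):
    """Average precision: mean of the precision measured at the rank of
    each positive example (an estimator of the area under the PR curve)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

On a dataset with few positives, a single highly ranked negative lowers average precision more than it lowers auROC, which is the reason auPR is often preferred for imbalanced data.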


Subjects
Algorithms; Computational Biology; Computer Simulation; Drug Discovery; Models, Chemical; Humans
14.
Proteins ; 86(7): 777-789, 2018 07.
Article in English | MEDLINE | ID: mdl-29675975

ABSTRACT

Glycation is a chemical reaction by which a sugar molecule bonds with a protein without the help of enzymes. It is often a cause of many diseases, and knowledge about glycation is therefore very important. In this paper, we present iProtGly-SS, a protein lysine glycation site identification method based on features extracted from sequence and secondary structural information. In the experiments, we found the best combination of feature groups: Amino Acid Composition, Secondary Structure Motifs, and Polarity. We used a support vector machine classifier to train our model, with an optimal set of features chosen by a group-based forward feature selection technique. On standard benchmark datasets, our method significantly outperforms existing methods for glycation prediction. A web server for iProtGly-SS is implemented and publicly available at: http://brl.uiu.ac.bd/iprotgly-ss/.


Subjects
Lysine/chemistry; Databases, Protein; Protein Structure, Secondary; Sequence Analysis, Protein; Support Vector Machine
15.
J Theor Biol ; 443: 138-146, 2018 04 14.
Article in English | MEDLINE | ID: mdl-29421211

ABSTRACT

Determining the subcellular localization of proteins is considered an important step towards understanding their functions. Previous studies have mainly focused on Gene Ontology (GO) as the main feature to tackle this problem. However, it has been shown that features extracted from GO are hard to use for new proteins with unknown GO. At the same time, evolutionary information extracted from the Position Specific Scoring Matrix (PSSM) has been shown to be another effective source of features for this problem. Despite tremendous advances using these sources for feature extraction, the problem remains unsolved. In this study we propose EvoStruct-Sub, which employs predicted structural information in conjunction with evolutionary information extracted directly from the protein sequence. To do this we use several different feature extraction methods that have shown promise in subcellular localization as well as in similar studies to extract effective local and global discriminatory information. We then use a Support Vector Machine (SVM) as our classification technique to build EvoStruct-Sub. As a result, we are able to enhance Gram-positive subcellular localization prediction accuracy by up to 5.6% over previous studies, including those that used GO for feature extraction.


Subjects
Bacterial Proteins/genetics; Computational Biology; Databases, Protein; Gene Ontology; Gram-Positive Bacteria/genetics; Support Vector Machine
16.
J Theor Biol ; 452: 22-34, 2018 09 07.
Article in English | MEDLINE | ID: mdl-29753757

ABSTRACT

A DNA-binding protein (DNA-BP) is a protein that can bind and interact with a DNA. Identification of DNA-BPs using experimental methods is expensive as well as time consuming. As such, fast and accurate computational methods are sought for predicting whether a protein can bind with a DNA or not. In this paper, we focus on building a new computational model to identify DNA-BPs in an efficient and accurate way. Our model extracts meaningful information directly from the protein sequences, without any dependence on functional domain or structural information. After feature extraction, we have employed Random Forest (RF) model to rank the features. Afterwards, we have used Recursive Feature Elimination (RFE) method to extract an optimal set of features and trained a prediction model using Support Vector Machine (SVM) with linear kernel. Our proposed method, named as DNA-binding Protein Prediction model using Chou's general PseAAC (DPP-PseAAC), demonstrates superior performance compared to the state-of-the-art predictors on standard benchmark dataset. DPP-PseAAC achieves accuracy values of 93.21%, 95.91% and 77.42% for 10-fold cross-validation test, jackknife test and independent test respectively. The source code of DPP-PseAAC, along with relevant dataset and detailed experimental results, can be found at https://github.com/srautonu/DNABinding. A publicly accessible web interface has also been established at: http://77.68.43.135:8080/DPP-PseAAC/.


Subjects
Algorithms; Computational Biology/methods; DNA-Binding Proteins/metabolism; Support Vector Machine; Amino Acid Sequence; Amino Acids/chemistry; Amino Acids/genetics; Amino Acids/metabolism; DNA/chemistry; DNA/genetics; DNA/metabolism; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; Databases, Protein; Models, Molecular; Nucleic Acid Conformation; Protein Domains; Reproducibility of Results
17.
J Theor Biol ; 435: 229-237, 2017 12 21.
Article in English | MEDLINE | ID: mdl-28943403

ABSTRACT

Bacteriophages are viruses that can significantly affect the functioning of bacteria and can be used in phage-based therapy. How a bacteriophage functions in its host bacterium depends on its location in the host cell. It is very important to know the subcellular location of phage proteins in a host cell in order to understand their working mechanism. In this paper, we propose iPHLoc-ES, a prediction method for subcellular localization of bacteriophage proteins. We aim to solve two problems: discriminating between host-located and non-host-located phage proteins, and discriminating between the locations of host-located proteins in a host cell (membrane or cytoplasm). To do this, we extract sets of evolutionary and structural features of phage proteins and employ a Support Vector Machine (SVM) as our classifier. We also use recursive feature elimination (RFE) to reduce the number of features for effective prediction. On a standard dataset using standard evaluation criteria, our method significantly outperforms the state-of-the-art predictor. iPHLoc-ES is readily available as a standalone tool from: https://github.com/swakkhar/iPHLoc-ES/ and as a web application from: http://brl.uiu.ac.bd/iPHLoc-ES/.


Subjects
Bacteriophages/chemistry; Cell Compartmentation; Support Vector Machine/standards; Viral Proteins/metabolism; Evolution, Molecular; Host-Pathogen Interactions; Intracellular Space/virology; Models, Biological; Viral Proteins/genetics
18.
PeerJ ; 12: e16762, 2024.
Article in English | MEDLINE | ID: mdl-38274328

ABSTRACT

Background: The global prevalence of neurodegenerative diseases such as Alzheimer's disease and Parkinson's disease is increasing gradually, whereas approvals of successful therapeutics for central nervous system disorders are inadequate. Accumulating evidence suggests pivotal roles of the receptor-interacting serine/threonine-protein kinase 1 (RIPK1) in modulating neuroinflammation and necroptosis. Discoveries of potent small molecule inhibitors for RIPK1 with favorable pharmacokinetic properties could thus address the unmet medical needs in treating neurodegeneration. Methods: In a structure-based virtual screening, we performed site-specific molecular docking of 4,858 flavonoids against the kinase domain of RIPK1 using AutoDock Vina. We predicted physicochemical descriptors of the top ligands using the SwissADME webserver. Binding interactions of the best ligands and the reference ligand L8D were validated using replicated 500-ns Gromacs molecular dynamics simulations and free energy calculations. Results: From Vina docking, we shortlisted the top 20 flavonoids with the highest binding affinities, ranging from -11.7 to -10.6 kcal/mol. Pharmacokinetic profiling narrowed down the list to three orally bioavailable and blood-brain-barrier penetrant flavonoids: Nitiducarpin, Pinocembrin 7-O-benzoate, and Paratocarpin J. Next, trajectories of molecular dynamics simulations of the top protein-ligand complexes were analyzed for binding interactions. The root-mean-square deviation (RMSD) was 1.191 Å (±0.498 Å), 1.725 Å (±0.828 Å), 1.923 Å (±0.942 Å), and 0.972 Å (±0.155 Å) for Nitiducarpin, Pinocembrin 7-O-benzoate, Paratocarpin J, and L8D, respectively. The radius of gyration (Rg) was 2.034 nm (±0.015 nm), 2.039 nm (±0.025 nm), 2.053 nm (±0.021 nm), and 2.037 nm (±0.016 nm) for Nitiducarpin, Pinocembrin 7-O-benzoate, Paratocarpin J, and L8D, respectively. The solvent accessible surface area (SASA) was 159.477 nm² (±3.021 nm²), 159.661 nm² (±3.707 nm²), 160.755 nm² (±4.252 nm²), and 156.630 nm² (±3.521 nm²) for the Nitiducarpin, Pinocembrin 7-O-benzoate, Paratocarpin J, and L8D complexes, respectively. Lower RMSD, Rg, and SASA values therefore indicated that Nitiducarpin formed the most stable complex with the target protein among the best three ligands. Finally, 2D protein-ligand interaction analysis revealed persistent hydrophobic interactions of Nitiducarpin with the critical residues of RIPK1, including the catalytic triads and the activation loop residues, implicated in the kinase activity and ligand binding. Conclusion: Our target-based virtual screening identified three flavonoids as strong RIPK1 inhibitors, with Nitiducarpin exhibiting the most potent inhibitory potential. Future in vitro and in vivo studies with these ligands could offer new hope for developing effective therapeutics and improving the quality of life for individuals affected by neurodegeneration.
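
For readers unfamiliar with the stability measures reported above, the sketch below shows the standard definitions of RMSD and radius of gyration for a single frame. It is a textbook illustration, not the Gromacs workflow used in the study: it assumes the two coordinate sets are already superimposed, uses plain tuples of (x, y, z), and defaults to unit masses when none are given.

```python
from math import sqrt

def rmsd(coords, ref):
    """Root-mean-square deviation between two equally sized coordinate
    sets that are assumed to be aligned beforehand."""
    n = len(coords)
    sq = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return sqrt(sq / n)

def radius_of_gyration(coords, masses=None):
    """Mass-weighted root-mean-square distance of the atoms from their
    centre of mass; unit masses are assumed if `masses` is omitted."""
    masses = masses or [1.0] * len(coords)
    m_tot = sum(masses)
    com = [sum(m * p[i] for m, p in zip(masses, coords)) / m_tot
           for i in range(3)]
    sq = sum(m * sum((p[i] - com[i]) ** 2 for i in range(3))
             for m, p in zip(masses, coords))
    return sqrt(sq / m_tot)
```

Applying such functions to every frame of a trajectory and taking the mean and standard deviation yields summary values of the kind quoted in the abstract, for example 1.191 Å (±0.498 Å).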


Subjects
Flavonoids; Quality of Life; Humans; Molecular Docking Simulation; Flavonoids/pharmacology; Ligands; Benzoates; Receptor-Interacting Protein Serine-Threonine Kinases
19.
Data Brief ; 52: 109938, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38173982

ABSTRACT

Alongside traditional publishing, news agencies now share news over the internet, since people increasingly prefer reading news online. Moreover, news media maintain YouTube channels to publish visual stories. Readers share their opinions in comments below the corresponding news item. These news items and comments have been a great source of information and research. However, there is a lack of research in the Bengali news context. This article presents a dataset containing 762,678 public comments and replies from 16,016 news videos published from 2017 to 2023 on a renowned Bengali news YouTube channel. The data contain 15 properties of the news, including video URL, title, likes, views, date of publishing, hashtags, description, comment author, comment time, comment, likes on the comment, reply author, reply time, reply, and likes on the replies. To ensure privacy, the commenter's name is encoded in the dataset. The dataset is open for use by researchers at https://data.mendeley.com/datasets/3c3j3bkxvn/4. A translated file for the raw dataset is also included. This data may help scholars identify patterns in public opinion and analyze how public opinion changes over time.

20.
Med Biol Eng Comput ; 62(9): 2769-2783, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38700613

ABSTRACT

Neurodegenerative diseases often exhibit a strong link with sleep disruption, highlighting the importance of effective sleep stage monitoring. In this light, automatic sleep stage classification (ASSC) plays a pivotal role, now more streamlined than ever due to the advancements in deep learning (DL). However, the opaque nature of DL models can be a barrier in their clinical adoption, due to trust concerns among medical practitioners. To bridge this gap, we introduce SleepBoost, a transparent multi-level tree-based ensemble model specifically designed for ASSC. Our approach includes a crafted feature engineering block (FEB) that extracts 41 time and frequency domain features, out of which 23 are selected based on their high mutual information score (> 0.23). Uniquely, SleepBoost integrates three fundamental linear models into a cohesive multi-level tree structure, further enhanced by a novel reward-based adaptive weight allocation mechanism. Tested on the Sleep-EDF-20 dataset, SleepBoost demonstrates superior performance with an accuracy of 86.3%, F1-score of 80.9%, and Cohen kappa score of 0.807, outperforming leading DL models in ASSC. An ablation study underscores the critical role of our selective feature extraction in enhancing model accuracy and interpretability, crucial for clinical settings. This innovative approach not only offers a more transparent alternative to traditional DL models but also extends potential implications for monitoring and understanding sleep patterns in the context of neurodegenerative disorders. The open-source availability of SleepBoost's implementation at https://github.com/akibzaman/SleepBoost can further facilitate its accessibility and potential for widespread clinical adoption.
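
Among the metrics reported above, the Cohen kappa score is the one most often misread, because it corrects raw agreement for the agreement expected by chance. A minimal sketch of the standard formula follows; the function name and the example stage labels are illustrative and are not taken from the SleepBoost repository.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected) / (1 - expected), where
    `expected` is the chance agreement implied by the label frequencies."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0
```

For instance, with true stages `["W", "W", "N1", "N1"]` and predictions `["W", "W", "N1", "W"]`, raw accuracy is 0.75 but kappa is only 0.5, because half of the agreement would be expected by chance.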


Subjects
Sleep Stages; Humans; Sleep Stages/physiology; Electroencephalography/methods; Algorithms; Deep Learning; Polysomnography/methods