1.
Nucleic Acids Res ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39271121

ABSTRACT

MicroRNAs (miRNAs) are short non-coding RNAs involved in various cellular processes, playing a crucial role in gene regulation. Identifying miRNA targets remains a central challenge and is pivotal for elucidating the complex gene regulatory networks. Traditional computational approaches have predominantly focused on identifying miRNA targets through perfect Watson-Crick base pairings within the seed region, referred to as canonical sites. However, emerging evidence suggests that perfect seed matches are not a prerequisite for miRNA-mediated regulation, underscoring the importance of also recognizing imperfect, or non-canonical, sites. To address this challenge, we propose Mimosa, a new computational approach that employs the Transformer framework to enhance the prediction of miRNA targets. Mimosa distinguishes itself by integrating contextual, positional and base-pairing information to capture in-depth attributes, thereby improving its predictive capabilities. Its unique ability to identify non-canonical base-pairing patterns makes Mimosa a standout model, reducing the reliance on pre-selecting candidate targets. Mimosa achieves superior performance in gene-level predictions and also shows impressive performance in site-level predictions across various non-human species through extensive benchmarking tests. To facilitate research efforts in miRNA targeting, we have developed an easy-to-use web server for comprehensive end-to-end predictions, which is publicly available at http://monash.bioweb.cloud.edu.au/Mimosa.
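
The canonical-site idea mentioned in the abstract above (perfect Watson-Crick pairing within the seed region) can be sketched in a few lines. This is only an illustration and not Mimosa's method: the seed is assumed here to be miRNA nucleotides 2-7, and the function names `reverse_complement_rna` and `find_canonical_seed_sites` are hypothetical.

```python
# Minimal sketch of canonical seed-site scanning (NOT the Mimosa model).
# Assumption: the seed is miRNA positions 2-7 (1-based), a common 6-mer definition.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def find_canonical_seed_sites(mirna, target, seed_start=2, seed_end=7):
    """Return 0-based start positions in `target` that pair perfectly with the seed.

    `mirna` and `target` are 5'->3' RNA strings over A/C/G/U.
    """
    seed = mirna[seed_start - 1:seed_end]   # 1-based inclusive -> Python slice
    site = reverse_complement_rna(seed)     # sequence the target must contain
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"       # seed (nt 2-7) is GAGGUA
    toy_target = "AAACUACCUCAAA"           # toy sequence containing UACCUC
    print(find_canonical_seed_sites(let7a, toy_target))
```

A scan like this finds only perfect seed matches; the non-canonical sites the abstract emphasizes would be missed by construction, which is the limitation a learned model is meant to address.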

2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39258883

ABSTRACT

N6-methyladenosine (m$^{6}$A) is a widely studied methylation of messenger RNAs that has been linked to diverse cellular processes and human diseases. Numerous databases that collate m$^{6}$A profiles of distinct cell types have been created to facilitate quick and easy mining of m$^{6}$A signatures associated with cell-specific phenotypes. However, these databases contain inherent complexities that have not been explicitly reported, which may lead to inaccurate identification and interpretation of m$^{6}$A-associated biology by end-users who are unaware of them. Here, we review various m$^{6}$A-related databases and highlight several critical matters. In particular, differences in peak-calling pipelines across databases drive substantial variability in both peak number and coordinates with only moderate reproducibility, and the inclusion of peak calls from early m$^{6}$A sequencing protocols may lead to the reporting of false positives or negatives. Awareness of these matters will help end-users avoid the inclusion of potentially unreliable data in their studies and better utilize m$^{6}$A databases to derive biologically meaningful results.


Subject(s)
Adenosine , Adenosine/analogs & derivatives , Adenosine/metabolism , Humans , Databases, Genetic , RNA, Messenger/genetics , RNA, Messenger/metabolism
3.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39276327

ABSTRACT

Recent advancements in high-throughput sequencing technologies have significantly enhanced our ability to unravel the intricacies of gene regulatory processes. A critical challenge in this endeavor is the identification of variant effects, a key factor in comprehending the mechanisms underlying gene regulation. Non-coding variants, constituting over 90% of all variants, have garnered increasing attention in recent years. The exploration of gene variant impacts and regulatory mechanisms has spurred the development of various deep learning approaches, providing new insights into the global regulatory landscape through the analysis of extensive genetic data. Here, we provide a comprehensive overview of the development of the non-coding variants models based on bulk and single-cell sequencing data and their model-based interpretation and downstream tasks. This review delineates the popular sequencing technologies for epigenetic profiling and deep learning approaches for discerning the effects of non-coding variants. Additionally, we summarize the limitations of current approaches in variant effect prediction research and outline opportunities for improvement. We anticipate that our study will offer a practical and useful guide for the bioinformatic community to further advance the unraveling of genetic variant effects.


Subject(s)
Deep Learning , Genetic Variation , Humans , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Epigenesis, Genetic
4.
Article in English | MEDLINE | ID: mdl-39302773

ABSTRACT

Molecular property prediction is a key component of AI-driven drug discovery and molecular representation learning. Despite recent advances, existing methods still face challenges such as limited ability to generalize and inadequate representation learning from unlabeled data, especially for tasks specific to molecular structures. To address these limitations, we introduce DIG-Mol, a novel self-supervised graph neural network framework for molecular property prediction. This architecture leverages the power of contrastive learning with dual interaction mechanisms and unique molecular graph augmentation strategies. DIG-Mol integrates a momentum distillation network with two interconnected networks to efficiently improve molecular representation. The framework's ability to extract key information about molecular structure and higher-order semantics is supported by minimizing a contrastive loss. We have established DIG-Mol's state-of-the-art performance through extensive experimental evaluation in a variety of molecular property prediction tasks. In addition to demonstrating superior transferability in few-shot learning scenarios, our visualizations highlight DIG-Mol's enhanced interpretability and representation capabilities. These findings confirm the effectiveness of our approach in overcoming challenges faced by traditional methods and mark a significant advance in molecular property prediction. The code for this project is available at https://github.com/ZeXingZ/DIG-Mol.

5.
Bioinformatics ; 40(8)2024 08 02.
Article in English | MEDLINE | ID: mdl-39133151

ABSTRACT

MOTIVATION: The asymmetrical distribution of expressed mRNAs tightly controls the precise synthesis of proteins within human cells. This non-uniform distribution, a cornerstone of developmental biology, plays a pivotal role in numerous cellular processes. To advance our comprehension of gene regulatory networks, it is essential to develop computational tools for accurately identifying the subcellular localizations of mRNAs. However, existing approaches give only limited consideration to multi-localization phenomena, and none considers the influence of RNA secondary structure. RESULTS: In this study, we propose Allocator, a multi-view parallel deep learning framework that seamlessly integrates RNA sequence-level and structure-level information, enhancing the prediction of mRNA multi-localization. The Allocator models are equipped with four efficient feature extractors, each designed to handle different inputs. Two are tailored for sequence-based inputs, incorporating multilayer perceptron and multi-head self-attention mechanisms. The other two are specialized in processing structure-based inputs, employing graph neural networks. Benchmarking results underscore Allocator's superiority over state-of-the-art methods, showcasing its strength in revealing intricate localization associations. AVAILABILITY AND IMPLEMENTATION: The webserver of Allocator is available at http://Allocator.unimelb-biotools.cloud.edu.au; the source code and datasets are available on GitHub (https://github.com/lifuyi774/Allocator) and Zenodo (https://doi.org/10.5281/zenodo.13235798).


Subject(s)
Computational Biology , Neural Networks, Computer , RNA, Messenger , RNA, Messenger/metabolism , RNA, Messenger/genetics , Humans , Computational Biology/methods , Nucleic Acid Conformation , Deep Learning , Software
6.
Mar Drugs ; 22(8)2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39195462

ABSTRACT

The direct enzymatic conversion of untreated waste shrimp and crab shells has been a key problem hindering the large-scale utilization of chitin biological resources. The microorganisms in soil samples were enriched in two stages with powdered chitin (CP) and shrimp shell powder (SSP) as substrates. The enrichment microbiota XHQ10 with SSP degradation ability was obtained. The activities of chitinase and lytic polysaccharide monooxygenase of XHQ10 were 1.46 and 54.62 U/mL, respectively. Metagenomic analysis showed that Chitinolyticbacter meiyuanensis, Chitiniphilus shinanonensis, and Chitinimonas koreensis, with excellent chitin degradation performance, were highly enriched in XHQ10. XHQ10 produces chitin oligosaccharides (CHOSs) through enzyme induction and two-stage temperature control; the product contains CHOSs with a degree of polymerization (DP) greater than ten and has excellent antioxidant activity. This work is the first study on the direct enzymatic preparation of CHOSs from SSP using enrichment microbiota, which provides a new path for the large-scale utilization of chitin bioresources.


Subject(s)
Animal Shells , Chitin , Chitinases , Microbiota , Oligosaccharides , Chitin/chemistry , Animals , Oligosaccharides/chemistry , Chitinases/metabolism , Animal Shells/chemistry , Metagenomics/methods , Temperature , Polymerization , Bacteria
7.
Theranostics ; 14(10): 3945-3962, 2024.
Article in English | MEDLINE | ID: mdl-38994035

ABSTRACT

Rationale: The NLRP3 inflammasome is critical in the development and progression of many metabolic diseases driven by chronic inflammation, but its effect on the pathology of postmenopausal osteoporosis (PMOP) remains poorly understood. Methods: We first examined the levels of NLRP3 inflammasome in PMOP patients by ELISA. Then we investigated the possible mechanisms underlying the effect of NLRP3 inflammasome on PMOP by RNA sequencing of osteoblasts treated with NLRP3 siRNA and by qPCR. Lastly, we assessed the effect of decreased NLRP3 levels on ovariectomized (OVX) rats. To specifically deliver NLRP3 siRNA to osteoblasts, we constructed NLRP3 siRNA-wrapping, osteoblast-specific aptamer (CH6)-functionalized lipid nanoparticles (termed CH6-LNPs-siNLRP3). Results: We found that the levels of NLRP3 inflammasome were significantly increased in patients with PMOP and were negatively correlated with estradiol levels. NLRP3 knockdown influenced signaling pathways including immune system process and the interferon signaling pathway. Notably, of the top ten up-regulated genes in NLRP3-reduced osteoblasts, nine genes (all except Mx2) were enriched in immune system process, and five genes were related to the interferon signaling pathway. The in vitro results showed that CH6-LNPs-siNLRP3 was relatively uniform, with a diameter of 96.64 ± 16.83 nm and a zeta potential of 38.37 ± 1.86 mV. CH6-LNPs-siNLRP3 did not show obvious cytotoxicity and selectively delivered siRNA to bone tissue. Moreover, CH6-LNPs-siNLRP3 stimulated osteoblast differentiation by activating ALP and enhancing osteoblast matrix mineralization. When administered to OVX rats, CH6-LNPs-siNLRP3 promoted bone formation and bone mass and improved bone microarchitecture and mechanical properties by decreasing the levels of NLRP3, IL-1β and IL-18 and increasing the levels of OCN and Runx2. Conclusion: NLRP3 inflammasome may be a new biomarker for PMOP diagnosis and plays a key role in the pathology of PMOP. CH6-LNPs-siNLRP3 has potential application for the treatment of PMOP.


Subject(s)
Inflammasomes , Liposomes , NLR Family, Pyrin Domain-Containing 3 Protein , Nanoparticles , Osteoblasts , Osteoporosis, Postmenopausal , Animals , NLR Family, Pyrin Domain-Containing 3 Protein/metabolism , Osteoblasts/drug effects , Osteoblasts/metabolism , Female , Humans , Rats , Inflammasomes/metabolism , Nanoparticles/chemistry , Osteoporosis, Postmenopausal/metabolism , Down-Regulation/drug effects , Rats, Sprague-Dawley , RNA, Small Interfering/administration & dosage , Aptamers, Nucleotide/pharmacology , Aptamers, Nucleotide/administration & dosage , Disease Models, Animal , Middle Aged , Ovariectomy
8.
IEEE J Biomed Health Inform ; 28(9): 5649-5657, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38865232

ABSTRACT

The Type III Secretion Systems (T3SSs) play a pivotal role in host-pathogen interactions by mediating the secretion of type III secretion system effectors (T3SEs) into host cells. These T3SEs mimic host cell protein functions, influencing interactions between Gram-negative bacterial pathogens and their hosts. Identifying T3SEs is essential in biomedical research for comprehending bacterial pathogenesis and its implications on human cells. This study presents EDIFIER, a novel multi-channel model designed for accurate T3SE prediction. It incorporates a graph structural channel, utilizing graph convolutional networks (GCN) to capture protein 3D structural features and a sequence channel based on the ProteinBERT pre-trained model to extract the sequence context features of T3SEs. Rigorous benchmarking tests, including ablation studies and comparative analysis, validate that EDIFIER outperforms current state-of-the-art tools in T3SE prediction. To enhance EDIFIER's accessibility to the broader scientific community, we developed a webserver that is publicly accessible at http://edifier.unimelb-biotools.cloud.edu.au/. We anticipate EDIFIER will contribute to the field by providing reliable T3SE predictions, thereby advancing our understanding of host-pathogen dynamics.


Subject(s)
Neural Networks, Computer , Type III Secretion Systems , Type III Secretion Systems/physiology , Computational Biology/methods , Humans , Bacterial Proteins/metabolism , Bacterial Proteins/chemistry
9.
Angew Chem Int Ed Engl ; 63(21): e202401189, 2024 05 21.
Article in English | MEDLINE | ID: mdl-38506220

ABSTRACT

This study introduces a novel approach for synthesizing Benzoxazine-centered Polychiral Polyheterocycles (BPCPHCs) via an innovative asymmetric carbene-alkyne metathesis-triggered cascade. Overcoming challenges associated with intricate stereochemistry and multiple chiral centers, the catalytic asymmetric Carbene Alkyne Metathesis-mediated Cascade (CAMC) is employed using dirhodium catalyst/Brønsted acid co-catalysis, ensuring precise stereo control as validated by X-ray crystallography. Systematic substrate scope evaluation establishes exceptional diastereo- and enantioselectivities, creating a unique library of BPCPHCs. Pharmacological exploration identifies twelve BPCPHCs as potent Nav ion channel blockers, notably compound 8g. In vivo studies demonstrate that intrathecal injection of 8g effectively reverses mechanical hyperalgesia associated with chemotherapy-induced peripheral neuropathy (CIPN), suggesting a promising therapeutic avenue. Electrophysiological investigations unveil the inhibitory effects of 8g on Nav1.7 currents. Molecular docking, dynamics simulations and surface plasmon resonance (SPR) assay provide insights into the stable complex formation and favorable binding free energy of 8g with C5aR1. This research represents a significant advancement in asymmetric CAMC for BPCPHCs and unveils BPCPHC 8g as a promising, uniquely acting pain blocker, establishing a C5aR1-Nav1.7 connection in the context of CIPN.


Subject(s)
Alkynes , Benzoxazines , Methane , Methane/analogs & derivatives , Methane/chemistry , Methane/pharmacology , Alkynes/chemistry , Benzoxazines/chemistry , Benzoxazines/pharmacology , Benzoxazines/chemical synthesis , Heterocyclic Compounds/chemistry , Heterocyclic Compounds/pharmacology , Heterocyclic Compounds/chemical synthesis , Humans , Stereoisomerism , Analgesics/chemistry , Analgesics/pharmacology , Analgesics/chemical synthesis , Molecular Structure , Catalysis , Drug Discovery , Animals
10.
ACS Chem Neurosci ; 15(6): 1063-1073, 2024 03 20.
Article in English | MEDLINE | ID: mdl-38449097

ABSTRACT

Chronic pain is a growing global health problem affecting at least 10% of the world's population. However, current chronic pain treatments are inadequate. Voltage-gated sodium channels (Navs) play a pivotal role in regulating neuronal excitability and pain signal transmission and thus are main targets for nonopioid painkiller development, especially those preferentially expressed in dorsal root ganglion (DRG) neurons, such as Nav1.6, Nav1.7, and Nav1.8. In this study, we screened in virtual hits from dihydrobenzofuran and 3-hydroxyoxindole hybrid molecules against Navs via a veratridine (VTD)-based calcium imaging method. The results showed that one of the molecules, 3g, could inhibit VTD-induced neuronal activity significantly. Voltage clamp recordings demonstrated that 3g inhibited the total Na+ currents of DRG neurons in a concentration-dependent manner. Biophysical analysis revealed that 3g slowed the activation while enhancing the inactivation of the Navs. Additionally, 3g use-dependently blocked Na+ currents. By combining selective Nav inhibitors with a heterologous expression system, we demonstrated that 3g preferentially inhibited the TTX-S Na+ currents, specifically the Nav1.7 current, rather than the TTX-R Na+ currents. Molecular docking experiments indicated that 3g binds to a known allosteric site at the voltage-sensing domain IV (VSDIV) of Nav1.7. Finally, intrathecal injection of 3g significantly relieved mechanical pain behavior in the spared nerve injury (SNI) rat model, suggesting that 3g is a promising candidate for treating chronic pain.


Subject(s)
Chronic Pain , Indoles , Neuralgia , Rats , Animals , Molecular Docking Simulation , NAV1.8 Voltage-Gated Sodium Channel , Neuralgia/drug therapy , Neuralgia/metabolism , Ganglia, Spinal/metabolism
11.
Bioinform Adv ; 4(1): vbae035, 2024.
Article in English | MEDLINE | ID: mdl-38549946

ABSTRACT

Motivation: PE/PPE proteins, highly abundant in the Mycobacterium genome, play a vital role in virulence and immune modulation. Understanding their functions is key to comprehending the internal mechanisms of Mycobacterium. However, a lack of dedicated resources has limited research into PE/PPE proteins. Results: Addressing this gap, we introduce MycobactERIal PE/PPE proTeinS (MERITS), a comprehensive 3D structure database specifically designed for PE/PPE proteins. MERITS hosts 22 353 non-redundant PE/PPE proteins, encompassing details like physicochemical properties, subcellular localization, post-translational modification sites, protein functions, and measures of antigenicity, toxicity, and allergenicity. MERITS also includes data on their secondary and tertiary structure, along with other relevant biological information. MERITS is designed to be user-friendly, offering interactive search and data browsing features to aid researchers in exploring the potential functions of PE/PPE proteins. MERITS is expected to become a crucial resource in the field, aiding in developing new diagnostics and vaccines by elucidating the sequence-structure-functional relationships of PE/PPE proteins. Availability and implementation: MERITS is freely accessible at http://merits.unimelb-biotools.cloud.edu.au/.

12.
J Chem Inf Model ; 64(4): 1407-1418, 2024 02 26.
Article in English | MEDLINE | ID: mdl-38334115

ABSTRACT

Studying the effect of single amino acid variations (SAVs) on protein structure and function is integral to advancing our understanding of molecular processes, evolutionary biology, and disease mechanisms. Screening for deleterious variants is one of the crucial issues in precision medicine. Here, we propose a novel computational approach, TransEFVP, based on large-scale protein language model embeddings and a transformer-based neural network to predict disease-associated SAVs. The model adopts a two-stage architecture: the first stage is designed to fuse different feature embeddings through a transformer encoder. In the second stage, a support vector machine model is employed to quantify the pathogenicity of SAVs after dimensionality reduction. The prediction performance of TransEFVP on blind test data achieves a Matthews correlation coefficient of 0.751, an F1-score of 0.846, and an area under the receiver operating characteristic curve of 0.871, higher than the existing state-of-the-art methods. The benchmark results demonstrate that TransEFVP can be explored as an accurate and effective SAV pathogenicity prediction method. The data and codes for TransEFVP are available at https://github.com/yzh9607/TransEFVP/tree/master for academic use.


Subject(s)
Algorithms , Proteins , Humans , Proteins/chemistry , Amino Acid Sequence , Neural Networks, Computer , Amino Acids
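
The three metrics quoted in the TransEFVP abstract above (Matthews correlation coefficient, F1-score, and area under the ROC curve) can be computed from binary labels and scores roughly as follows. This is a plain-Python sketch of the standard definitions, not the authors' evaluation code; the function names and the toy data are illustrative assumptions.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Toy labels and predictions, invented for illustration only.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print(mcc(tp, tn, fp, fn), f1_score(tp, fp, fn))
    print(auroc([0.9, 0.8, 0.3], [0.1, 0.4, 0.2]))
```

The pairwise AUROC is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be the usual choice for large test sets.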
13.
Environ Sci Technol ; 58(10): 4662-4669, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38422482

ABSTRACT

Since the mass production and extensive use of chloroquine (CLQ) would lead to its inevitable discharge, wastewater treatment plants (WWTPs) might play a key role in the management of CLQ. Despite the reported functional versatility of ammonia-oxidizing bacteria (AOB) that mediate the first step for biological nitrogen removal at WWTP (i.e., partial nitrification), their potential capability to degrade CLQ remains to be discovered. Therefore, with the enriched partial nitrification sludge, a series of dedicated batch tests were performed in this study to verify the performance and mechanisms of CLQ biodegradation under the ammonium conditions of mainstream wastewater. The results showed that AOB could degrade CLQ in the presence of ammonium oxidation activity, but the capability was limited by the amount of partial nitrification sludge (∼1.1 mg/L at a mixed liquor volatile suspended solids concentration of 200 mg/L). CLQ and its biodegradation products were found to have no significant effect on the ammonium oxidation activity of AOB while the latter would promote N2O production through the AOB denitrification pathway, especially at relatively low DO levels (≤0.5 mg-O2/L). This study provided valuable insights into a more comprehensive assessment of the fate of CLQ in the context of wastewater treatment.


Subject(s)
Ammonia , Ammonium Compounds , Ammonia/metabolism , Sewage/microbiology , Bacteria/metabolism , Bioreactors/microbiology , Oxidation-Reduction , Nitrous Oxide/analysis , Nitrification , Ammonium Compounds/metabolism
14.
Article in English | MEDLINE | ID: mdl-38190667

ABSTRACT

Origins of replication sites (ORIs) are crucial genomic regions where DNA replication initiation takes place, playing pivotal roles in fundamental biological processes like cell division, gene expression regulation, and DNA integrity. Accurate identification of ORIs is essential for comprehending cell replication, gene expression, and mutation-related diseases. However, experimental approaches for ORI identification are often expensive and time-consuming, leading to the growing popularity of computational methods. In this study, we present PLANNER (DeeP LeArNiNg prEdictor for ORI), a novel approach for species-specific and cell-specific prediction of eukaryotic ORIs. PLANNER uses the multi-scale ktuple sequences as input and employs the DNABERT pre-training model with transfer learning and ensemble learning strategies to train accurate predictive models. Extensive empirical test results demonstrate that PLANNER achieved superior predictive performance compared to state-of-the-art approaches, including iOri-Euk, Stack-ORI, and ORI-Deep, within specific cell types and across different cell types. Furthermore, by incorporating an interpretable analysis mechanism, we provide insights into the learned patterns, facilitating the mapping from discovering important sequential determinants to comprehensively analysing their biological functions. To facilitate the widespread utilisation of PLANNER, we developed an online webserver and local stand-alone software, available at http://planner.unimelb-biotools.cloud.edu.au/ and https://github.com/CongWang3/PLANNER, respectively.
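
The "multi-scale ktuple sequences" mentioned in the abstract above can be pictured as splitting one DNA sequence into overlapping k-mers at several values of k. The sketch below assumes a stride of 1 and k = 3 to 6 (the k-mer sizes for which DNABERT models were released); the abstract does not state PLANNER's actual settings, and the function names are hypothetical.

```python
def kmer_tokens(seq, k):
    """Split `seq` into overlapping k-mers with stride 1 (empty list if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def multiscale_kmers(seq, ks=(3, 4, 5, 6)):
    """Return a dict mapping each k to the k-mer tokens of `seq`.

    The default `ks` is an assumption for illustration, not PLANNER's documented setting.
    """
    return {k: kmer_tokens(seq, k) for k in ks}


if __name__ == "__main__":
    toy_sequence = "ATGCATGC"  # invented example sequence
    for k, tokens in multiscale_kmers(toy_sequence).items():
        # Each token list could be joined with spaces and passed to a k-mer tokenizer.
        print(k, " ".join(tokens))
```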

15.
Comput Biol Med ; 168: 107681, 2024 01.
Article in English | MEDLINE | ID: mdl-37992470

ABSTRACT

Multidrug-resistant Gram-negative bacteria have evolved into a worldwide threat to human health; over recent decades, polymyxins have re-emerged in clinical practice due to their high activity against multidrug-resistant bacteria. Nevertheless, the nephrotoxicity and neurotoxicity of polymyxins seriously hinder their practical use in the clinic. Based on the quantitative structure-activity relationship (QSAR), analogue design is an efficient strategy for discovering biologically active compounds with fewer adverse effects. To accelerate the discovery of polymyxin analogues and find analogues with high antimicrobial activity against Gram-negative bacteria, we developed PmxPred, a GCN and CatBoost-based machine learning framework. RDKit descriptors were used to represent the molecules and residues, and an ensemble learning model was used to predict antimicrobial activity. This framework was trained and evaluated on multiple Gram-negative bacteria datasets, including Acinetobacter baumannii, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa and a general Gram-negative bacteria dataset, achieving an AUROC of 0.857, 0.880, 0.756, 0.895 and 0.865 on the independent test, respectively. PmxPred outperformed the transfer learning method that was trained on 10 million molecules. We interpreted our well-trained model by analysing the importance of global and residue features. Overall, PmxPred provides a powerful additional tool for predicting active polymyxin analogues and holds the potential to elucidate the mechanisms underlying the antimicrobial activity of polymyxins. The source code is publicly available on GitHub (https://github.com/yanwu20/PmxPred).


Subject(s)
Gram-Negative Bacterial Infections , Polymyxins , Humans , Polymyxins/pharmacology , Polymyxins/chemistry , Anti-Bacterial Agents/chemistry , Gram-Negative Bacterial Infections/drug therapy , Gram-Negative Bacterial Infections/microbiology , Gram-Negative Bacteria , Drug Resistance, Multiple, Bacterial , Escherichia coli , Microbial Sensitivity Tests
16.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37874948

ABSTRACT

Proteases contribute to a broad spectrum of cellular functions. Given a relatively limited amount of experimental data, developing accurate sequence-based predictors of substrate cleavage sites facilitates a better understanding of protease functions and substrate specificity. While many protease-specific predictors of substrate cleavage sites were developed, these efforts are outpaced by the growth of the protease substrate cleavage data. In particular, since data for 100+ protease types are available and this number continues to grow, it becomes impractical to publish predictors for new protease types, and instead it might be better to provide a computational platform that helps users to quickly and efficiently build predictors that address their specific needs. To this end, we conceptualized, developed, tested and released a versatile bioinformatics platform, ProsperousPlus, that empowers users, even those with no programming or little bioinformatics background, to build fast and accurate predictors of substrate cleavage sites. ProsperousPlus facilitates the use of the rapidly accumulating substrate cleavage data to train, empirically assess and deploy predictive models for user-selected substrate types. Benchmarking tests on test datasets show that our platform produces predictors that on average exceed the predictive performance of current state-of-the-art approaches. ProsperousPlus is available as a webserver and a stand-alone software package at http://prosperousplus.unimelb-biotools.cloud.edu.au/.


Subject(s)
Machine Learning , Peptide Hydrolases , Peptide Hydrolases/metabolism , Substrate Specificity , Algorithms
17.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37369638

ABSTRACT

Antimicrobial peptides (AMPs) are short peptides that play crucial roles in diverse biological processes and have various functional activities against target organisms. Due to the abuse of chemical antibiotics and microbial pathogens' increasing resistance to antibiotics, AMPs have the potential to be alternatives to antibiotics. As such, the identification of AMPs has become a widely discussed topic. A variety of computational approaches have been developed to identify AMPs based on machine learning algorithms. However, most of them are not capable of predicting the functional activities of AMPs, and those predictors that can specify activities only focus on a few of them. In this study, we first surveyed 10 predictors that can identify AMPs and their functional activities in terms of the features they employed and the algorithms they utilized. Then, we constructed comprehensive AMP datasets and proposed a new deep learning-based framework, iAMPCN (identification of AMPs based on CNNs), to identify AMPs and their related 22 functional activities. Our experiments demonstrate that iAMPCN significantly improved the prediction performance of AMPs and their corresponding functional activities based on four types of sequence features. Benchmarking experiments on the independent test datasets showed that iAMPCN outperformed a number of state-of-the-art approaches for predicting AMPs and their functional activities. Furthermore, we analyzed the amino acid preferences of different AMP activities and evaluated the model on datasets of varying sequence redundancy thresholds. To facilitate the community-wide identification of AMPs and their corresponding functional types, we have made the source codes of iAMPCN publicly available at https://github.com/joy50706/iAMPCN/tree/master. We anticipate that iAMPCN can be explored as a valuable tool for identifying potential AMPs with specific functional activities for further experimental validation.


Subject(s)
Antimicrobial Cationic Peptides , Deep Learning , Antimicrobial Cationic Peptides/pharmacology , Antimicrobial Peptides , Anti-Bacterial Agents , Algorithms
18.
Comput Biol Med ; 163: 107155, 2023 09.
Article in English | MEDLINE | ID: mdl-37356289

ABSTRACT

The genome of Mycobacterium tuberculosis contains a relatively high percentage (10%) of genes that are poorly characterised because of their highly repetitive nature and high GC content. Some of these genes encode proteins of the PE/PPE family, which are thought to be involved in host-pathogen interactions, virulence, and disease pathogenicity. Members of this family are genetically divergent and challenging to both identify and classify using conventional computational tools. Thus, advanced in silico methods are needed to efficiently identify proteins of this family for subsequent functional annotation. In this study, we developed the first deep learning-based approach, termed Digerati, for the rapid and accurate identification of PE and PPE family proteins. Digerati was built upon a multipath parallel hybrid deep learning framework, which combines multi-layer convolutional neural networks with bidirectional long short-term memory and a self-attention module to effectively learn the higher-order feature representations of PE/PPE proteins. Empirical studies demonstrated that Digerati achieved a significantly better performance (∼18-20%) than alignment-based approaches, including BLASTP, PHMMER, and HHsuite, in both prediction accuracy and speed. Digerati is anticipated to facilitate community-wide efforts to conduct high-throughput identification and analysis of PE/PPE family members. The webserver and source codes of Digerati are publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/Digerati/.


Subject(s)
Deep Learning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Bacterial Proteins/genetics , Virulence/genetics
19.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 3205-3214, 2023.
Article in English | MEDLINE | ID: mdl-37289599

ABSTRACT

RNA modifications have been shown to play essential roles in multiple biological processes. Accurate identification of RNA modifications in the transcriptome is critical for providing insights into their biological functions and mechanisms. Many tools have been developed for predicting RNA modifications at single-base resolution; these employ conventional feature engineering methods, whose feature design and feature selection steps require extensive biological expertise and may introduce redundant information. With the rapid development of artificial intelligence technologies, end-to-end methods have been well received by researchers. Nevertheless, in nearly all of these approaches, each trained model is suitable for only one specific type of RNA methylation modification. In this study, we present MRM-BERT, which feeds task-specific sequences into the BERT (Bidirectional Encoder Representations from Transformers) model and fine-tunes it, and which performs competitively with state-of-the-art methods. MRM-BERT avoids repeated de novo training of the model and can predict multiple RNA modifications such as pseudouridine, m6A, m5C, and m1A in Mus musculus, Arabidopsis thaliana, and Saccharomyces cerevisiae. In addition, we analyse the attention heads to identify high-attention regions for the prediction, and conduct saturation in silico mutagenesis of the input sequences to discover potential changes in RNA modifications, which can better assist researchers in their follow-up research.
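The in silico mutagenesis step can be sketched as follows: every position of the input sequence is replaced in turn by each alternative base, and the change in the model score is recorded. The scoring function below is a hypothetical stand-in (the fraction of A in the sequence) used only to keep the example runnable; in practice it would be replaced by the prediction of a trained model, and this sketch is not the MRM-BERT implementation.

```python
def single_base_mutants(seq, alphabet="ACGU"):
    """Yield (position, ref_base, alt_base, mutant_sequence) for every single substitution."""
    for i, ref in enumerate(seq):
        for alt in alphabet:
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]


def mutation_effects(seq, score_fn, alphabet="ACGU"):
    """Return (position, ref, alt, score_change) for every single-base mutant of seq."""
    base = score_fn(seq)
    return [
        (i, ref, alt, score_fn(mutant) - base)
        for i, ref, alt, mutant in single_base_mutants(seq, alphabet)
    ]


def toy_score(seq):
    """Hypothetical placeholder for a trained predictor: fraction of A in the sequence."""
    return seq.count("A") / len(seq)


effects = mutation_effects("GGACU", toy_score)
# Substitutions that lower the toy score the most.
most_disruptive = sorted(effects, key=lambda e: e[3])[:3]
```

For a sequence of length n over a four-letter alphabet this evaluates 3n mutants, so the cost grows linearly with sequence length and is dominated by the cost of one model call.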


Subject(s)
Arabidopsis , Artificial Intelligence , Mice , Animals , Pseudouridine , Arabidopsis/genetics , Transcriptome , Saccharomyces cerevisiae/genetics , RNA/genetics
20.
Brief Bioinform ; 24(4)2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37291763

ABSTRACT

BACKGROUND: Promoters are DNA regions near transcription start sites that initiate the transcription of specific genes. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for bacteria to synthesize gene-encoded products, grow, and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed for a particular species. To date, only a few predictors are available for identifying general bacterial promoters, and their predictive performance is limited. RESULTS: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER takes DNA sequences as input and employs three Siamese neural networks with attention layers to train and optimize models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the TIMER web server is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/.
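The 10-fold cross-validation protocol referred to in the abstract can be sketched in a few lines of standard-library Python. The function name, the shuffling seed and the way samples are assigned to folds are assumptions made for illustration; the abstract does not describe how TIMER's folds were constructed (for example, whether they were stratified by class or species).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Every sample index appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible splits
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Example: 25 sequences split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

In each round the model is trained on `train` and evaluated on `test`, and the ten scores are averaged; an independent test set, as used in the study, is kept out of this loop entirely.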


Subject(s)
Bacteria , Neural Networks, Computer , Bacteria/genetics , Bacteria/metabolism , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Base Sequence , Promoter Regions, Genetic