Results 1 - 20 of 44
1.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39293807

ABSTRACT

Cancer is a severe illness that significantly threatens human life and health. Anticancer peptides (ACPs) represent a promising therapeutic strategy for combating cancer. In silico methods enable rapid and accurate identification of ACPs without extensive human and material resources. This study proposes a two-stage computational framework called ACP-CapsPred, which can accurately identify ACPs and characterize their functional activities across different cancer types. ACP-CapsPred integrates a protein language model with evolutionary information and physicochemical properties of peptides, constructing a comprehensive profile of peptides. ACP-CapsPred employs a next-generation neural network, specifically capsule networks, to construct predictive models. Experimental results demonstrate that ACP-CapsPred exhibits satisfactory predictive capabilities in both stages, reaching state-of-the-art performance. In the first stage, ACP-CapsPred achieves accuracies of 80.25% and 95.71%, as well as F1-scores of 79.86% and 95.90%, on benchmark datasets Set 1 and Set 2, respectively. In the second stage, tasked with characterizing the functional activities of ACPs across five selected cancer types, ACP-CapsPred attains an average accuracy of 90.75% and an F1-score of 91.38%. Furthermore, ACP-CapsPred demonstrates excellent interpretability, revealing regions and residues associated with anticancer activity. Consequently, ACP-CapsPred presents a promising solution to expedite the development of ACPs and offers a novel perspective for other biological sequence analyses.


Subject(s)
Antineoplastic Agents , Computational Biology , Neural Networks, Computer , Peptides , Humans , Antineoplastic Agents/chemistry , Antineoplastic Agents/pharmacology , Peptides/chemistry , Computational Biology/methods , Neoplasms/drug therapy , Neoplasms/metabolism , Databases, Protein
2.
iScience ; 27(9): 110718, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39262770

ABSTRACT

The rise of antibiotic resistance necessitates effective alternative therapies. Antimicrobial peptides (AMPs) are promising due to their broad inhibitory effects. This study focuses on predicting the minimum inhibitory concentration (MIC) of AMPs against WHO-priority pathogens: Staphylococcus aureus ATCC 25923, Escherichia coli ATCC 25922, and Pseudomonas aeruginosa ATCC 27853. We developed a comprehensive regression model integrating AMP sequence-based and genomic features. Using eight AI-based architectures, including deep learning with protein language model embeddings, we created an ensemble model combining bi-directional long short-term memory (BiLSTM), convolutional neural network (CNN), and multi-branch model (MBM). The ensemble model showed superior performance with Pearson correlation coefficients of 0.756, 0.781, and 0.802 for the bacterial strains, demonstrating its accuracy in predicting MIC values. This work sets a foundation for future studies to enhance model performance and advance AMP applications in combating antibiotic resistance.
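
The Pearson correlation coefficient reported above measures how closely predicted values track measured ones. A minimal sketch of the computation follows; the function name and the log2(MIC) values are hypothetical and are not taken from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical log2(MIC) values, for illustration only
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 5.1]
r = pearson_r(measured, predicted)
```

A value near 1 indicates that predictions rise and fall with the measurements; it does not by itself show that the predicted magnitudes are correct.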

3.
J Chem Inf Model ; 64(14): 5725-5736, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38946113

ABSTRACT

Enhancers are a class of noncoding DNA, serving as crucial regulatory elements in governing gene expression by binding to transcription factors. The identification of enhancers holds paramount importance in the field of biology. However, traditional experimental methods for enhancer identification demand substantial human and material resources. Consequently, there is a growing interest in employing computational methods for enhancer prediction. In this study, we propose a two-stage framework based on deep learning, termed CapsEnhancer, for the identification of enhancers and their strengths. CapsEnhancer utilizes chaos game representation to encode DNA sequences into unique images and employs a capsule network to extract local and global features from sequence "images". Experimental results demonstrate that CapsEnhancer achieves state-of-the-art performance in both stages. In the first and second stages, the accuracy surpasses the previous best methods by 8 and 3.5%, reaching accuracies of 94.5 and 95%, respectively. Notably, this study represents the pioneering application of computer vision methods to enhancer identification tasks. Our work not only contributes novel insights to enhancer identification but also provides a fresh perspective for other biological sequence analysis tasks.
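
Chaos game representation, mentioned above, maps a DNA sequence to points in the unit square by repeatedly moving halfway toward the corner assigned to each base. The sketch below is illustrative only: the corner layout, the grid size, and the function names are assumptions and are not taken from CapsEnhancer.

```python
# Assumed corner layout (a common convention, not confirmed by the abstract)
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    """Return the chaos-game trajectory of a DNA sequence in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points

def cgr_image(seq, size=8):
    """Count trajectory points per cell of a size x size grid."""
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid

image = cgr_image("ACGTTGCAACGT", size=4)
```

The resulting count grid can be treated as a single-channel image, which is the kind of input an image-based network would consume.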


Subject(s)
Computational Biology , Enhancer Elements, Genetic , Computational Biology/methods , Humans , Nonlinear Dynamics , Deep Learning
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38706321

ABSTRACT

Antiviral peptides (AVPs) have shown potential in inhibiting viral attachment, preventing viral fusion with host cells and disrupting viral replication due to their unique action mechanisms. They have now become a broad-spectrum, promising antiviral therapy. However, identifying effective AVPs is traditionally slow and costly. This study proposed a new two-stage computational framework for AVP identification. The first stage identifies AVPs from a wide range of peptides, and the second stage recognizes AVPs targeting specific families or viruses. This method integrates contrastive learning and multi-feature fusion strategy, focusing on sequence information and peptide characteristics, significantly enhancing predictive ability and interpretability. The evaluation results of the model show excellent performance, with accuracy of 0.9240 and Matthews correlation coefficient (MCC) score of 0.8482 on the non-AVP independent dataset, and accuracy of 0.9934 and MCC score of 0.9869 on the non-AMP independent dataset. Furthermore, our model can predict antiviral activities of AVPs against six key viral families (Coronaviridae, Retroviridae, Herpesviridae, Paramyxoviridae, Orthomyxoviridae, Flaviviridae) and eight viruses (FIV, HCV, HIV, HPIV3, HSV1, INFVA, RSV, SARS-CoV). Finally, to facilitate user accessibility, we built a user-friendly web interface deployed at https://awi.cuhk.edu.cn/~dbAMP/AVP/.
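
Accuracy and the Matthews correlation coefficient (MCC) quoted above are both derived from the binary confusion matrix. A minimal sketch follows; the counts in the example are hypothetical and do not reproduce the study's datasets.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# Hypothetical counts, for illustration only
acc = accuracy(tp=45, tn=47, fp=3, fn=5)
score = mcc(tp=45, tn=47, fp=3, fn=5)
```

MCC ranges from -1 to 1 and uses all four cells of the matrix, so it is less easily inflated by class imbalance than accuracy.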


Subject(s)
Antiviral Agents , Computational Biology , Peptides , Antiviral Agents/pharmacology , Peptides/chemistry , Computational Biology/methods , Humans , Viruses , Machine Learning , Algorithms
5.
Protein Sci ; 33(6): e5006, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38723168

ABSTRACT

The emergence and spread of antibiotic-resistant bacteria pose a significant public health threat, necessitating the exploration of alternative antibacterial strategies. Antibacterial peptides (ABPs) are a class of antimicrobial peptides (AMPs) with the potential to fight bacterial infection, offering a promising avenue for developing novel therapeutic interventions. This study introduces AMPActiPred, a three-stage computational framework designed to identify ABPs, characterize their activity against diverse bacterial species, and predict their activity levels. AMPActiPred employs multiple peptide descriptors to capture the compositional features and physicochemical properties of peptides. AMPActiPred uses a deep forest architecture, a cascading architecture similar to deep neural networks, capable of effectively processing and exploring original features to enhance predictive performance. In the first stage, AMPActiPred focuses on ABP identification, achieving an accuracy of 87.6% and an MCC of 0.742 on an elaborate dataset, demonstrating state-of-the-art performance. In the second stage, AMPActiPred achieved an average GMean of 82.8% in identifying ABPs targeting 10 bacterial species, indicating that AMPActiPred can achieve balanced predictions regarding the functional activity of ABPs across this set of species. In the third stage, AMPActiPred demonstrates robust predictive capabilities for ABP activity levels with an average PCC of 0.722. Furthermore, AMPActiPred exhibits excellent interpretability, elucidating crucial features associated with antibacterial activity. AMPActiPred is the first computational framework capable of predicting targets and activity levels of ABPs. Finally, to facilitate the utilization of AMPActiPred, we have established a user-friendly web interface deployed at https://awi.cuhk.edu.cn/~AMPActiPred/.
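
The abstract above reads a high GMean as evidence of balanced predictions. Assuming GMean denotes the usual geometric mean of sensitivity and specificity (the abstract does not define it), a minimal sketch with hypothetical counts shows why it penalizes one-sided classifiers.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# A classifier that always predicts "negative" on a 10:90 dataset is
# 90% accurate, yet its G-mean is 0 because sensitivity is 0.
always_negative = g_mean(tp=0, fn=10, tn=90, fp=0)

# Hypothetical counts for a more balanced classifier
balanced = g_mean(tp=8, fn=2, tn=72, fp=18)
```

Because the two rates are multiplied, a low value on either class pulls the score down, which is why the metric is often preferred on imbalanced data.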


Subject(s)
Anti-Bacterial Agents , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/chemistry , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/pharmacology , Bacteria/drug effects , Computational Biology/methods , Neural Networks, Computer , Microbial Sensitivity Tests
6.
Int J Mol Sci ; 25(5)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38474116

ABSTRACT

RNA modification plays a crucial role in cellular regulation. However, traditional high-throughput sequencing methods for elucidating their functional mechanisms are time-consuming and labor-intensive, despite extensive research. Moreover, existing methods often limit their focus to specific species, neglecting the simultaneous exploration of RNA modifications across diverse species. Therefore, a versatile computational approach is necessary for interpretable analysis of RNA modifications across species. A multi-scale biological language-based deep learning model is proposed for interpretable, sequential-level prediction of diverse RNA modifications. Benchmark comparisons across species demonstrate the model's superiority in predicting various RNA methylation types over current state-of-the-art methods. The cross-species validation and attention weight visualization also highlight the model's capability to capture sequential and functional semantics from genomic backgrounds. Our analysis of RNA modifications helps us find the potential existence of "biological grammars" in each modification type, which could be effective for mapping methylation-related sequential patterns and understanding the underlying biological mechanisms of RNA modifications.


Subject(s)
Deep Learning , RNA , RNA/genetics , RNA Methylation , Methylation , Protein Processing, Post-Translational
7.
iScience ; 27(4): 109333, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38523792

ABSTRACT

Kinases are important enzymes that transfer phosphate groups from high-energy, phosphate-donating molecules to specific substrates and play essential roles in various cellular processes. Existing algorithms for inferring kinase activity from phosphorylated proteomics data are often costly, requiring valuable samples. Moreover, methods to extract kinase activities from bulk RNA sequencing data remain undeveloped. In this study, we propose a computational framework, KinPred-RNA, to derive kinase activities from bulk RNA-sequencing data in cancer samples. The KinPred-RNA framework, using the extreme gradient boosting (XGBoost) regression model, outperforms random forest regression, multiple linear regression, and support vector machine regression models in predicting kinase activities from cancer-related RNA sequencing data. Efficient gene signatures from the LINCS-L1000 dataset were used as inputs for KinPred-RNA. The results highlight its potential to be related to biological function. In conclusion, KinPred-RNA constitutes a significant advance in cancer research by potentially facilitating the identification of cancer.

8.
Anal Chem ; 96(4): 1538-1546, 2024 01 30.
Article in English | MEDLINE | ID: mdl-38226973

ABSTRACT

Tuberculosis (TB) is a severe disease caused by Mycobacterium tuberculosis that poses a significant threat to human health. The emergence of drug-resistant strains has made the global fight against TB even more challenging. Antituberculosis peptides (ATPs) have shown promising results as a potential treatment for TB. However, conventional wet lab-based approaches to ATP discovery are time-consuming and costly and often fail to discover peptides with desired properties. To address these challenges, we propose a novel machine learning-based framework called ATPfinder that can significantly accelerate the discovery of ATP. Our approach integrates various efficient peptide descriptors and utilizes the deep forest algorithm to construct the model. This neural network-like cascading structure can effectively process and mine features without complex hyperparameter tuning. Our experimental results show that ATPfinder outperforms existing ATP prediction tools, achieving state-of-the-art performance with an accuracy of 89.3% and an MCC of 0.70. Moreover, our framework exhibits better robustness than baseline algorithms commonly used for other sequence analysis tasks. Additionally, the excellent interpretability of our model can assist researchers in understanding the critical features of ATP. Finally, we developed a downloadable desktop application to simplify the use of our framework for researchers. Therefore, ATPfinder can facilitate the discovery of peptide drugs and provide potential solutions for TB treatment. Our framework is freely available at https://github.com/lantianyao/ATPfinder/ (data sets and code) and https://awi.cuhk.edu.cn/dbAMP/ATPfinder.html (software).


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Humans , Peptides/pharmacology , Antitubercular Agents/pharmacology , Algorithms , Tuberculosis/drug therapy , Forests , Adenosine Triphosphate
9.
Article in English | MEDLINE | ID: mdl-38059128

ABSTRACT

Methamphetamine use disorder (MUD) is an illness associated with severe health consequences. Virtual reality (VR) is used to induce the drug-cue reactivity and significant EEG and ECG abnormalities were found in MUD patients. However, whether a link exists between EEG and ECG abnormalities in patients with MUD during exposure to drug cues remains unknown. This is important from the therapeutic viewpoint because different treatment strategies may be applied when EEG abnormalities and ECG irregularities are complications of MUD. We designed a VR system with drug cues and EEG and ECG were recorded during VR exposure. Sixteen patients with MUD and sixteen healthy subjects were recruited. Statistical tests and Pearson correlation were employed to analyze the EEG and ECG. The results showed that, during VR induction, the patients with MUD but not healthy controls showed significant [Formula: see text] and [Formula: see text] power increases when the stimulus materials were most intense. This finding indicated that the stimuli are indiscriminate to healthy controls but meaningful to patients with MUD. Five heart rate variability (HRV) indexes significantly differed between patients and controls, suggesting abnormalities in the reaction of patient's autonomic nervous system. Importantly, significant relations between EEG and HRV indexes changes were only identified in the controls, but not in MUD patients, signifying a disruption of brain-heart relations in patients. Our findings of stimulus-specific EEG changes and the impaired brain-heart relations in patients with MUD shed light on the understanding of drug-cue reactivity and may be used to design diagnostic and/or therapeutic strategies for MUD.


Subject(s)
Methamphetamine , Virtual Reality , Humans , Methamphetamine/adverse effects , Cues , Brain , Heart Rate/physiology
10.
J Chem Inf Model ; 63(24): 7886-7898, 2023 Dec 25.
Article in English | MEDLINE | ID: mdl-38054927

ABSTRACT

Inflammation is a biological response to harmful stimuli, aiding in the maintenance of tissue homeostasis. However, excessive or persistent inflammation can precipitate a myriad of pathological conditions. Although current treatments such as NSAIDs, corticosteroids, and immunosuppressants are effective, they can have side effects and resistance issues. In this backdrop, anti-inflammatory peptides (AIPs) have emerged as a promising therapeutic approach against inflammation. Leveraging machine learning methods, we have the opportunity to accelerate the discovery and investigation of these AIPs more effectively. In this study, we proposed an advanced framework by ensemble machine learning and deep learning for AIP prediction. Initially, we constructed three individual models with extremely randomized trees (ET), gated recurrent unit (GRU), and convolutional neural networks (CNNs) with attention mechanism and then used stacking architecture to build the final predictor. By utilizing various sequence encodings and combining the strengths of different algorithms, our predictor demonstrated exemplary performance. On our independent test set, our model achieved an accuracy, MCC, and F1-score of 0.757, 0.500, and 0.707, respectively, clearly outperforming other contemporary AIP prediction methods. Additionally, our model offers profound insights into the feature interpretation of AIPs, establishing a valuable knowledge foundation for the design and development of future anti-inflammatory strategies.


Subject(s)
Deep Learning , Humans , Anti-Inflammatory Agents/pharmacology , Anti-Inflammatory Agents/therapeutic use , Peptides/pharmacology , Inflammation/drug therapy , Algorithms , Machine Learning
11.
iScience ; 26(12): 108250, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38025779

ABSTRACT

The challenge of drug-resistant bacteria to global public health has led to increased attention on antimicrobial peptides (AMPs) as a targeted therapeutic alternative with a lower risk of resistance. However, high production costs and limitations in functional class prediction have hindered progress in this field. In this study, we used multi-label classifiers with binary relevance and algorithm adaptation techniques to predict different functions of AMPs across a wide range of pathogen categories, including bacteria, mammalian cells, fungi, viruses, and cancer cells. Our classifiers attained promising AUC scores varying from 0.8492 to 0.9126 on independent testing data. Forward feature selection identified sequence order and charge as critical, with specific amino acids (C and E) as discriminative. These findings provide valuable insights for the design of antimicrobial peptides (AMPs) with multiple functionalities, thus contributing to the broader effort to combat drug-resistant pathogens.
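
Binary relevance, one of the multi-label techniques named above, turns a single multi-label problem into one independent binary problem per label. The sketch below shows only that decomposition step; the function name, peptide strings, and label names are hypothetical toy data and are not taken from the study.

```python
def binary_relevance_split(samples, label_sets, labels):
    """Build one binary dataset per label from a multi-label dataset.

    Each sample keeps its features; its target becomes 1 if the label is
    in the sample's label set and 0 otherwise.
    """
    return {
        label: [(x, 1 if label in ys else 0) for x, ys in zip(samples, label_sets)]
        for label in labels
    }

# Toy data: made-up peptide strings with made-up activity annotations
peptides = ["KWKLFKKI", "GLFDIVKK", "RRWWCRCE"]
activities = [{"antibacterial", "antifungal"}, {"antiviral"}, {"antibacterial"}]
all_labels = ["antibacterial", "antifungal", "antiviral"]

per_label = binary_relevance_split(peptides, activities, all_labels)
```

Each entry of `per_label` can then be passed to any binary classifier, and the per-label predictions are combined into the final label set for a peptide.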

12.
Viruses ; 15(10)2023 09 27.
Article in English | MEDLINE | ID: mdl-37896791

ABSTRACT

Cervical cancer, a major health concern among women worldwide, is closely linked to human papillomavirus (HPV) infection. This study explores the evolving landscape of HPV molecular epidemiology in Taiwan over a decade (2010-2020), where prophylactic HPV vaccination has been implemented since 2007. Analyzing data from 40,561 vaginal swab samples, with 42.0% testing positive for HPV, we reveal shifting trends in HPV genotype distribution and infection patterns. The 12 high-risk genotypes, in order of decreasing percentage, were HPV 52, 58, 16, 18, 51, 56, 39, 59, 33, 31, 45, and 35. The predominant genotypes were HPV 52, 58, and 16, accounting for over 70% of cases annually. The proportions of high-risk and non-high-risk HPV infections varied across age groups. High-risk infections predominated in sexually active individuals aged 30-50 and were mixed-type infections. The composition of high-risk HPV genotypes was generally stable over time; however, HPV31, 33, 39, and 51 significantly decreased over the decade. Of the strains, HPV31 and 33 are shielded by the nonavalent HPV vaccine. However, no reduction was noted for the other seven genotypes. This study offers valuable insights into the post-vaccine HPV epidemiology. Future investigations should delve into HPV vaccines' effects and their implications for cervical cancer prevention strategies. These findings underscore the need for continued surveillance and research to guide effective public health interventions targeting HPV-associated diseases.


Subject(s)
Papillomavirus Infections , Papillomavirus Vaccines , Uterine Cervical Neoplasms , Humans , Female , Uterine Cervical Neoplasms/epidemiology , Uterine Cervical Neoplasms/prevention & control , Human Papillomavirus Viruses , Papillomavirus Infections/epidemiology , Papillomavirus Infections/prevention & control , Molecular Epidemiology , Papillomaviridae/genetics , Genotype , Human papillomavirus 31/genetics , Prevalence
13.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37742050

ABSTRACT

The emergence of multidrug-resistant bacteria is a critical global crisis that poses a serious threat to public health, particularly with the rise of multidrug-resistant Staphylococcus aureus. Accurate assessment of drug resistance is essential for appropriate treatment and prevention of transmission of these deadly pathogens. Early detection of drug resistance in patients is critical for providing timely treatment and reducing the spread of multidrug-resistant bacteria. This study aims to develop a novel risk assessment framework for S. aureus that can accurately determine the resistance to multiple antibiotics. The comprehensive 7-year study involved >20,000 isolates with susceptibility testing profiles of six antibiotics. By incorporating mass spectrometry and machine learning, the study was able to predict the susceptibility to four different antibiotics with high accuracy. To validate the accuracy of our models, we externally tested on an independent cohort and achieved impressive results with an area under the receiver operating characteristic curve of 0.94, 0.90, 0.86 and 0.91, and an area under the precision-recall curve of 0.93, 0.87, 0.87 and 0.81, respectively, for oxacillin, clindamycin, erythromycin and trimethoprim-sulfamethoxazole. In addition, the framework evaluated the level of multidrug resistance of the isolates by using the predicted drug resistance probabilities, interpreting them in the context of a multidrug resistance risk score and analyzing the performance contribution of different sample groups. The results of this study provide an efficient method for early antibiotic decision-making and a better understanding of the multidrug resistance risk of S. aureus.
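
The area under the receiver operating characteristic curve cited above equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal pairwise sketch of that definition follows; the labels and scores are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical resistance labels (1 = resistant) and predicted probabilities
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.3, 0.1]
auc = auroc(y_true, y_score)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of resistant from susceptible isolates.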


Subject(s)
Methicillin-Resistant Staphylococcus aureus , Staphylococcal Infections , Humans , Staphylococcus aureus , Staphylococcal Infections/drug therapy , Staphylococcal Infections/microbiology , Anti-Bacterial Agents/pharmacology , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Machine Learning , Risk Assessment
14.
Protein Sci ; 32(10): e4758, 2023 10.
Article in English | MEDLINE | ID: mdl-37595093

ABSTRACT

Fungal infections have become a significant global health issue, affecting millions worldwide. Antifungal peptides (AFPs) have emerged as a promising alternative to conventional antifungal drugs due to their low toxicity and low propensity for inducing resistance. In this study, we developed a deep learning-based framework called DeepAFP to efficiently identify AFPs. DeepAFP fully leverages and mines composition information, evolutionary information, and physicochemical properties of peptides by employing combined kernels from multiple branches of convolutional neural network with bi-directional long short-term memory layers. In addition, DeepAFP integrates a transfer learning strategy to obtain efficient representations of peptides for improving model performance. DeepAFP demonstrates strong predictive ability on carefully curated datasets, yielding an accuracy of 93.29% and an F1-score of 93.45% on the DeepAFP-Main dataset. The experimental results show that DeepAFP outperforms existing AFP prediction tools, achieving state-of-the-art performance. Finally, we provide a downloadable AFP prediction tool to meet the demands of large-scale prediction and facilitate the usage of our framework by the public or other researchers. Our framework can accurately identify AFPs in a short time without requiring significant human and material resources, and hence can accelerate the development of AFPs as well as contribute to the treatment of fungal infections. Furthermore, our method can provide new perspectives for other biological sequence analysis tasks.


Subject(s)
Deep Learning , Mycoses , Humans , Algorithms , Antifungal Agents/pharmacology , alpha-Fetoproteins , Peptides/pharmacology , Peptides/chemistry
15.
Int J Mol Sci ; 24(12)2023 Jun 19.
Article in English | MEDLINE | ID: mdl-37373494

ABSTRACT

One of the major challenges in cancer therapy lies in the limited targeting specificity exhibited by existing anti-cancer drugs. Tumor-homing peptides (THPs) have emerged as a promising solution to this issue, due to their capability to specifically bind to and accumulate in tumor tissues while minimally impacting healthy tissues. THPs are short oligopeptides that offer a superior biological safety profile, with minimal antigenicity, and faster incorporation rates into target cells/tissues. However, identifying THPs experimentally, using methods such as phage display or in vivo screening, is a complex, time-consuming task, hence the need for computational methods. In this study, we proposed StackTHPred, a novel machine learning-based framework that predicts THPs using optimal features and a stacking architecture. With an effective feature selection algorithm and three tree-based machine learning algorithms, StackTHPred has demonstrated advanced performance, surpassing existing THP prediction methods. It achieved an accuracy of 0.915 and a 0.831 Matthews Correlation Coefficient (MCC) score on the main dataset, and an accuracy of 0.883 and a 0.767 MCC score on the small dataset. StackTHPred also offers favorable interpretability, enabling researchers to better understand the intrinsic characteristics of THPs. Overall, StackTHPred is beneficial for both the exploration and identification of THPs and facilitates the development of innovative cancer therapies.


Subject(s)
Neoplasms , Peptides , Humans , Peptides/metabolism , Oligopeptides , Algorithms , Machine Learning
16.
Article in English | MEDLINE | ID: mdl-37022368

ABSTRACT

Early diagnosis and treatment can reduce the symptoms of Attention Deficit/Hyperactivity Disorder (ADHD) in children, but medical diagnosis is usually delayed. Hence, it is important to increase the efficiency of early diagnosis. Previous studies used behavioral and neuronal data during GO/NOGO task to help detect ADHD and the accuracy differed considerably from 53% to 92%, depending on the employed methods and the number of electroencephalogram (EEG) channels. It remains unclear whether data from a few EEG channels can still lead to a good accuracy of detecting ADHD. Here, we hypothesize that introducing distractions into a VR-based GO/NOGO task can augment the detection of ADHD using 6-channel EEG because children with ADHD are easily distracted. Forty-nine ADHD children and 32 typically developing children were recruited. We use a clinically applicable system with EEG to record data. Statistical analysis and machine learning methods were employed to analyze the data. The behavioral results revealed significant differences in task performance when there are distractions. The presence of distractions leads to EEG changes in both groups, indicating immaturity in inhibitory control. Importantly, the distractions additionally enhanced the between-group differences in NOGO α and γ power, reflecting insufficient inhibition in different neural networks for distraction suppression in the ADHD group. Machine learning methods further confirmed that distractions enhance the detection of ADHD with an accuracy of 85.45%. In conclusion, this system can assist in fast screenings for ADHD and the findings of neuronal correlates of distractions can help design therapeutic strategies.

17.
Microbiol Spectr ; 11(3): e0347922, 2023 06 15.
Article in English | MEDLINE | ID: mdl-37042778

ABSTRACT

In clinical microbiology, matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) is frequently employed for rapid microbial identification. However, rapid identification of antimicrobial resistance (AMR) in Escherichia coli based on a large amount of MALDI-TOF MS data has not yet been reported. This may be because building a prediction model to cover all E. coli isolates would be challenging given the high diversity of the E. coli population. This study aimed to develop a MALDI-TOF MS-based, data-driven, two-stage framework for characterizing different AMRs in E. coli. Specifically, amoxicillin (AMC), ceftazidime (CAZ), ciprofloxacin (CIP), ceftriaxone (CRO), and cefuroxime (CXM) were used. In the first stage, we split the data into two groups based on informative peaks according to the importance of the random forest. In the second stage, prediction models were constructed using four different machine learning algorithms: logistic regression, support vector machine, random forest, and extreme gradient boosting (XGBoost). The findings demonstrate that XGBoost outperformed the other three machine learning models. The values of the area under the receiver operating characteristic curve were 0.62, 0.72, 0.87, 0.72, and 0.72 for AMC, CAZ, CIP, CRO, and CXM, respectively. This implies that a data-driven, two-stage framework could improve accuracy by approximately 2.8%. As a result, we developed AMR prediction models for E. coli using a data-driven two-stage framework, which is promising for assisting physicians in making decisions. Further, the analysis of informative peaks in future studies could potentially reveal new insights.
IMPORTANCE Based on a large amount of matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS) clinical data, comprising 37,918 Escherichia coli isolates, a data-driven two-stage framework was established to evaluate the antimicrobial resistance of E. coli. Five antibiotics, including amoxicillin (AMC), ceftazidime (CAZ), ciprofloxacin (CIP), ceftriaxone (CRO), and cefuroxime (CXM), were considered for the two-stage model training, and the values of the area under the receiver operating characteristic curve (AUC) were 0.62 for AMC, 0.72 for CAZ, 0.87 for CIP, 0.72 for CRO, and 0.72 for CXM. Further investigations revealed that the informative peak m/z 9714 appeared with some important peaks at m/z 6809, m/z 7650, m/z 10534, and m/z 11783 for CIP and at m/z 6809, m/z 10475, and m/z 8447 for CAZ, CRO, and CXM. This framework has the potential to improve the accuracy by approximately 2.8%, indicating a promising potential for further research.


Subject(s)
Anti-Bacterial Agents , Escherichia coli , Anti-Bacterial Agents/pharmacology , Ceftriaxone/pharmacology , Ceftazidime , Cefuroxime , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Ciprofloxacin , Amoxicillin
18.
Int J Mol Sci ; 24(5)2023 Feb 21.
Article in English | MEDLINE | ID: mdl-36901759

ABSTRACT

Cancer is one of the leading diseases threatening human life and health worldwide. Peptide-based therapies have attracted much attention in recent years. Therefore, the precise prediction of anticancer peptides (ACPs) is crucial for discovering and designing novel cancer treatments. In this study, we proposed a novel machine learning framework (GRDF) that incorporates deep graphical representation and deep forest architecture for identifying ACPs. Specifically, GRDF extracts graphical features based on the physicochemical properties of peptides and integrates their evolutionary information along with binary profiles for constructing models. Moreover, we employ the deep forest algorithm, which adopts a layer-by-layer cascade architecture similar to deep neural networks, enabling excellent performance on small datasets but without complicated tuning of hyperparameters. The experiment shows GRDF exhibits state-of-the-art performance on two elaborate datasets (Set 1 and Set 2), achieving 77.12% accuracy and 77.54% F1-score on Set 1, as well as 94.10% accuracy and 94.15% F1-score on Set 2, exceeding existing ACP prediction methods. Our models exhibit greater robustness than the baseline algorithms commonly used for other sequence analysis tasks. In addition, GRDF is well-interpretable, enabling researchers to better understand the features of peptide sequences. The promising results demonstrate that GRDF is remarkably effective in identifying ACPs. Therefore, the framework presented in this study could assist researchers in facilitating the discovery of anticancer peptides and contribute to developing novel cancer treatments.


Subject(s)
Neoplasms , Peptides , Humans , Peptides/chemistry , Algorithms , Amino Acid Sequence , Neural Networks, Computer
19.
Int J Mol Sci ; 24(2)2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36674514

ABSTRACT

Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) has been used to identify microorganisms and predict antibiotic resistance. The preprocessing method for the MS spectrum is key to extracting critical information from complicated MS spectral data. Different preprocessing methods yield different data, and the optimal approach is unclear. In this study, we adopted an ensemble of multiple preprocessing methods (FlexAnalysis, MALDIquant, and continuous wavelet transform-based methods) to detect peaks and build machine learning classifiers, including logistic regressions, naïve Bayes classifiers, random forests, and a support vector machine. The aim was to identify antibiotic resistance in Acinetobacter baumannii, Acinetobacter nosocomialis, Enterococcus faecium, and Group B Streptococci (GBS) based on MALDI-TOF MS spectra collected from two branches of a referral tertiary medical center. The ensemble method was compared with the individual methods. Random forest models built with the data preprocessed by the ensemble method outperformed those built with individual preprocessing methods and achieved the highest accuracy, with values of 84.37% (A. baumannii), 90.96% (A. nosocomialis), 78.54% (E. faecium), and 70.12% (GBS) on independent testing datasets. Through feature selection, important peaks related to antibiotic resistance could be detected from the integrated information. The prediction model can provide an advisory opinion for clinicians. The discriminative peaks that enable better prediction performance can serve as a reference for further investigation of the resistance mechanism.


Subject(s)
Acinetobacter Infections , Acinetobacter baumannii , Humans , Anti-Bacterial Agents/pharmacology , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Bayes Theorem , Acinetobacter baumannii/chemistry
20.
Brief Bioinform ; 24(2)2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36715277

ABSTRACT

N6-methyladenosine (m6A) modification is the most abundant co-transcriptional modification in eukaryotic RNA and plays important roles in cellular regulation. Traditional high-throughput sequencing experiments used to explore functional mechanisms are time-consuming and labor-intensive, and most of the proposed methods have focused on a limited number of species. To further understand the relevant biological mechanisms among different species with the same RNA modification, it is necessary to develop a computational scheme that can be applied to different species. To achieve this, we proposed an attention-based deep learning method, adaptive-m6A, which consists of a convolutional neural network, a bi-directional long short-term memory network, and an attention mechanism, to identify m6A sites in multiple species. In addition, three conventional machine learning (ML) methods, namely support vector machine, random forest, and logistic regression classifiers, were considered in this work. In multi-species prediction, alongside the results of the ML methods, adaptive-m6A at its best reached an accuracy of 0.9832 and an area under the receiver operating characteristic curve of 0.98. Moreover, motif analysis and cross-validation among different species were conducted to test the robustness of one model across multiple species, which helped improve our understanding of the sequence characteristics and biological functions of RNA modifications in different species.


Subject(s)
Machine Learning , RNA , Base Sequence , RNA/genetics , Neural Networks, Computer