Results 1 - 20 of 1,306
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38366802

ABSTRACT

Anti-coronavirus peptides (ACVPs) represent a relatively novel approach to inhibiting the adsorption and fusion of the virus with human cells. Several peptide-based inhibitors have shown promise as potential therapeutic drug candidates. However, identifying such peptides in laboratory experiments is both costly and time-consuming. Therefore, there is growing interest in using computational methods to predict ACVPs. Here, we describe a model for the prediction of ACVPs that is based on the combination of feature engineering (FE) optimization and deep representation learning. FEOpti-ACVP was pre-trained using two feature extraction frameworks. In the next step, several machine learning approaches were tested to construct the final algorithm. The final version of FEOpti-ACVP outperformed existing methods used for ACVP prediction, and it has the potential to become a valuable tool in ACVP drug design. A user-friendly webserver of FEOpti-ACVP can be accessed at http://servers.aibiochem.net/soft/FEOpti-ACVP/.


Subject(s)
Algorithms , Peptides , Humans , Amino Acid Sequence , Peptides/pharmacology , Machine Learning
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426324

ABSTRACT

Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor in various pathological processes and play a critical role in most complex human diseases. Nonetheless, identifying these associations through wet-lab experiments is error-prone and labor-intensive, and the numerous existing computational methods for predicting circRNA-miRNA associations (CMAs) rely on only a single type of correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features with Word2vec and interaction behavior features with two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization. Comprehensive experimental analysis showed that BGF-CMAP predicted the complex relationships between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 predicted miRNA-associated circRNAs were confirmed by relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model to provide a scientific theoretical basis for the study of CMA prediction.


Subject(s)
MicroRNAs , Humans , MicroRNAs/genetics , RNA, Circular/genetics , ROC Curve , Machine Learning , Algorithms , Computational Biology/methods
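
The BGF-CMAP abstract reports an accuracy and an area under the receiver operating characteristic curve (AUC). As a minimal sketch of what those two numbers measure, the following computes both from predicted scores. The function names, the 0.5 threshold, and the toy data are illustrative assumptions and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy data (hypothetical): 1 = known association, 0 = no known association.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(roc_auc(labels, scores), accuracy(labels, scores))
```

AUC is threshold-free, whereas accuracy depends on the chosen cut-off, which is one reason abstracts in this list tend to report both.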
3.
J Biol Chem ; 300(7): 107431, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38825006

ABSTRACT

Antibiotic-resistant Enterobacterales pose a major threat to healthcare systems worldwide, necessitating the development of novel strategies to fight such hard-to-kill bacteria. One potential approach is to develop molecules that force bacteria to hyper-activate prodrug antibiotics, thus rendering them more effective. In the present work, we aimed to obtain proof-of-concept data to support that small molecules targeting transcriptional regulators can potentiate the antibiotic activity of the prodrug metronidazole (MTZ) against Escherichia coli under aerobic conditions. By screening a chemical library of small molecules, a series of structurally related molecules were identified that had little inherent antibiotic activity but showed substantial activity in combination with ineffective concentrations of MTZ. Transcriptome analyses, functional genetics, thermal shift assays, and electrophoretic mobility shift assays were then used to demonstrate that these MTZ boosters target the transcriptional repressor MarR, resulting in the upregulation of the marRAB operon and its downstream MarA regulon. The associated upregulation of the flavin-containing nitroreductase, NfsA, was then shown to be critical for the booster-mediated potentiation of MTZ antibiotic activity. Transcriptomic studies, biochemical assays, and electron paramagnetic resonance measurements were then used to show that under aerobic conditions, NfsA catalyzed 1-electron reduction of MTZ to the MTZ radical anion, which in turn induced lethal DNA damage in E. coli. This work reports the first example of prodrug boosting in Enterobacterales by transcriptional modulators and highlights that MTZ antibiotic activity can be chemically induced under aerobic growth conditions.


Subject(s)
Anti-Bacterial Agents , Escherichia coli Proteins , Escherichia coli , Metronidazole , Nitroreductases , Repressor Proteins , Nitroreductases/metabolism , Nitroreductases/genetics , Escherichia coli/drug effects , Escherichia coli/metabolism , Escherichia coli/genetics , Metronidazole/pharmacology , Escherichia coli Proteins/metabolism , Escherichia coli Proteins/genetics , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/chemistry , Aerobiosis , Repressor Proteins/metabolism , Repressor Proteins/genetics , Gene Expression Regulation, Bacterial/drug effects , Small Molecule Libraries/pharmacology , Small Molecule Libraries/chemistry
4.
Mol Biol Evol ; 41(7)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38934805

ABSTRACT

Most algorithms that are used to predict the effects of variants rely on evolutionary conservation. However, a majority of such techniques compute evolutionary conservation solely from the alignment of multiple sequences while overlooking the evolutionary context of substitution events. In our previous study, we introduced PHACT, a scoring-based pathogenicity predictor for missense mutations that can leverage phylogenetic trees. Building on this foundation, we now propose PHACTboost, a gradient boosting tree-based classifier that combines PHACT scores with information from multiple sequence alignments, phylogenetic trees, and ancestral reconstruction. By learning from data, PHACTboost outperforms PHACT. Furthermore, the results of comprehensive experiments on carefully constructed sets of variants demonstrated that PHACTboost can outperform 40 prevalent pathogenicity predictors reported in the dbNSFP, including conventional tools, metapredictors, and deep learning-based approaches as well as more recent tools such as AlphaMissense, EVE, and CPT-1. The superiority of PHACTboost over these methods was particularly evident in the case of hard variants for which different pathogenicity predictors offered conflicting results. We provide predictions of 215 million amino acid alterations over 20,191 proteins. PHACTboost is available at https://github.com/CompGenomeLab/PHACTboost. PHACTboost can improve our understanding of genetic diseases and facilitate more accurate diagnoses.


Subject(s)
Mutation, Missense , Phylogeny , Humans , Software , Computational Biology/methods , Algorithms , Sequence Alignment
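
Gradient boosting over trees is the common thread of most entries in this list, including PHACTboost. The sketch below shows the core idea in its simplest form: each round fits a one-split tree (a stump) to the residuals of the current model and adds a shrunken copy of it. It assumes a single numeric feature and squared-error loss, and it is not the PHACTboost implementation; all function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on a 1-D feature that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all xs identical: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Additive model: start from the mean, then repeatedly fit stumps to residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict_boosting(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data (hypothetical): a step function of one feature.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_boosting(xs, ys)
print(predict_boosting(model, 2), predict_boosting(model, 5))
```

Production libraries such as XGBoost, LightGBM, and CatBoost, which are named in other abstracts here, add deeper trees, regularization, and other loss functions on top of this loop.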
5.
Brief Bioinform ; 25(1)2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38127089

ABSTRACT

Long noncoding RNAs (lncRNAs) participate in various biological processes and have close linkages with diseases. In vivo and in vitro experiments have validated many associations between lncRNAs and diseases. However, biological experiments are time-consuming and expensive. Here, we introduce LDA-VGHB, an lncRNA-disease association (LDA) identification framework that incorporates feature extraction based on singular value decomposition and a variational graph autoencoder, and LDA classification based on a heterogeneous Newton boosting machine. LDA-VGHB was compared with four classical LDA prediction methods (i.e. SDLDA, LDNFSGB, IPCARF and LDASR) and four popular boosting models (XGBoost, AdaBoost, CatBoost and LightGBM) under 5-fold cross-validations on lncRNAs, diseases, lncRNA-disease pairs, and independent lncRNAs and independent diseases, respectively. It greatly outperformed the other methods under all four cross-validation settings on the lncRNADisease and MNDR databases. We further investigated potential lncRNAs for lung cancer, breast cancer, colorectal cancer and kidney neoplasms and inferred the top 20 lncRNAs associated with them among all their unobserved lncRNAs. The results showed that most of the predicted top 20 lncRNAs have been verified by biomedical experiments provided by the Lnc2Cancer 3.0, lncRNADisease v2.0 and RNADisease databases as well as publications. We found that HAR1A, KCNQ1DN, ZFAT-AS1 and HAR1B may be associated with lung cancer, breast cancer, colorectal cancer and kidney neoplasms, respectively. These results need further biological experimental validation. We anticipate that LDA-VGHB is capable of identifying possible lncRNAs for complex diseases. LDA-VGHB is publicly available at https://github.com/plhhnu/LDA-VGHB.


Subject(s)
Breast Neoplasms , Colorectal Neoplasms , Kidney Neoplasms , Lung Neoplasms , RNA, Long Noncoding , Humans , Female , RNA, Long Noncoding/genetics , Databases, Factual , Lung Neoplasms/genetics , Breast Neoplasms/genetics
6.
Methods ; 229: 1-8, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38768932

ABSTRACT

SARS-CoV-2's global spread has instigated a critical health and economic emergency, impacting countless individuals. Understanding the virus's phosphorylation sites is vital to unravel the molecular intricacies of the infection and subsequent changes in host cellular processes. Several computational methods have been proposed to identify phosphorylation sites, typically focusing on either serine/threonine (S/T) or tyrosine (Y) phosphorylation sites. Unfortunately, current predictive tools perform best on these specific residues and may not extend their efficacy to other residues, emphasizing the urgent need for enhanced methodologies. In this study, we developed a novel predictor that integrates phosphorylation site information for all three residues (STY). We extracted ten different feature descriptors, primarily derived from composition, evolutionary, and position-specific information, and assessed their discriminative power using five classifiers. Our results indicated that Light Gradient Boosting (LGB) showed superior performance, and five descriptors displayed excellent discriminative capabilities. Subsequently, we identified the top two integrated features with high discriminative capability and trained them with LGB to develop the final prediction model, LGB-IPs. The proposed approach shows excellent performance on 10-fold cross-validation, with ACC, MCC, and AUC values of 0.831, 0.662, and 0.907, respectively. Notably, this performance is replicated in the independent evaluation. Consequently, our approach may provide valuable insights into the phosphorylation mechanisms in SARS-CoV-2 infection for biomedical researchers.


Subject(s)
COVID-19 , Computational Biology , SARS-CoV-2 , Phosphorylation , SARS-CoV-2/metabolism , Humans , COVID-19/virology , COVID-19/metabolism , Computational Biology/methods
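
The LGB-IPs abstract summarizes performance with accuracy (ACC) and the Matthews correlation coefficient (MCC). A minimal sketch of both, computed from the four cells of a binary confusion matrix, is shown here. The counts are made-up illustration values, not results from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC: correlation between predicted and true binary labels, in [-1, 1]."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical confusion matrix for phosphorylation-site predictions.
tp, tn, fp, fn = 40, 45, 5, 10
print(accuracy(tp, tn, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

Unlike accuracy, MCC stays near zero for a classifier that only predicts the majority class, which makes it a useful companion metric on imbalanced site-prediction data.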
7.
Methods ; 223: 56-64, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38237792

ABSTRACT

DNA-binding proteins are a class of proteins that can interact with DNA molecules through physical and chemical interactions. Their main functions include regulating gene expression, maintaining chromosome structure and stability, and more. DNA-binding proteins play a crucial role in cellular and molecular biology, as they are essential for maintaining normal cellular physiological functions and adapting to environmental changes. The prediction of DNA-binding proteins has been a hot topic in the field of bioinformatics. The key to accurately classifying DNA-binding proteins is to find suitable feature sources and explore the information they contain. Although there are already many models for predicting DNA-binding proteins, there is still room for improvement in mining feature source information and calculation methods. In this study, we created a model called DBPboost to better identify DNA-binding proteins. The innovation of this study lies in the use of eight feature extraction methods, the improvement of the feature selection step, which involves selecting some features first and then performing feature selection again after feature fusion, and the optimization of the differential evolution algorithm in feature fusion, which improves the performance of feature fusion. The experimental results show that the prediction accuracy of the model on the UniSwiss dataset is 89.32%, and the sensitivity is 89.01%, which is better than most existing models.


Subject(s)
DNA-Binding Proteins , Support Vector Machine , DNA-Binding Proteins/chemistry , Algorithms , DNA/chemistry , Computational Biology/methods
8.
BMC Bioinformatics ; 25(1): 265, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138564

ABSTRACT

BACKGROUND: Survival analysis is used to characterize time-to-event data. In medical studies, a typical application is to analyze the survival time of specific cancers using high-dimensional gene expression data. The main challenges include the presence of non-informative gene expressions and possibly nonlinear relationships between survival time and gene expressions. Moreover, because of imprecise data collection or recording errors, measurement error may be ubiquitous in the survival time and its censoring status. Ignoring measurement error effects may yield biased estimators and wrong conclusions. RESULTS: To tackle these challenges and derive reliable estimates with a computationally efficient implementation, we develop the R package AFFECT, which stands for Accelerated Functional Failure time model with Error-Contaminated survival Times. CONCLUSIONS: This package aims to correct for measurement error effects in survival times and implements a boosting algorithm on the corrected data to identify informative gene expressions and derive the corresponding nonlinear functions.


Subject(s)
Algorithms , Humans , Survival Analysis , Neoplasms/genetics , Neoplasms/mortality , Software , Gene Expression Profiling/methods , Gene Expression/genetics
9.
BMC Bioinformatics ; 25(1): 188, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38745112

ABSTRACT

BACKGROUND: Microbiome dysbiosis has recently been associated with different diseases and disorders. In this context, machine learning (ML) approaches can be useful either to identify new patterns or learn predictive models. However, data to be fed to ML methods can be subject to different sampling, sequencing and preprocessing techniques. Each different choice in the pipeline can lead to a different view (i.e., feature set) of the same individuals, that classical (single-view) ML approaches may fail to simultaneously consider. Moreover, some views may be incomplete, i.e., some individuals may be missing in some views, possibly due to the absence of some measurements or to the fact that some features are not available/applicable for all the individuals. Multi-view learning methods can represent a possible solution to consider multiple feature sets for the same individuals, but most existing multi-view learning methods are limited to binary classification tasks or cannot work with incomplete views. RESULTS: We propose irBoost.SH, an extension of the multi-view boosting algorithm rBoost.SH, based on multi-armed bandits. irBoost.SH solves multi-class classification tasks and can analyze incomplete views. At each iteration, it identifies one winning view using adversarial multi-armed bandits and uses its predictions to update a shared instance weight distribution in a learning process based on boosting. In our experiments, performed on 5 multi-view microbiome datasets, the model learned by irBoost.SH always outperforms the best model learned from a single view, its closest competitor rBoost.SH, and the model learned by a multi-view approach based on feature concatenation, reaching an improvement of 11.8% of the F1-score in the prediction of the Autism Spectrum disorder and of 114% in the prediction of the Colorectal Cancer disease. CONCLUSIONS: The proposed method irBoost.SH exhibited outstanding performances in our experiments, also compared to competitor approaches. 
The obtained results confirm that irBoost.SH can fruitfully be adopted for the analysis of microbiome data, due to its capability to simultaneously exploit multiple feature sets obtained through different sequencing and preprocessing pipelines.


Subject(s)
Algorithms , Machine Learning , Microbiota , Humans
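
The irBoost.SH abstract says that a winning view is picked at each iteration using adversarial multi-armed bandits. The abstract does not name the bandit algorithm, so the sketch below uses EXP3, a standard algorithm for the adversarial setting, purely as an illustrative assumption. Each arm stands for one view, and the reward is a hypothetical value in [0, 1] describing how useful the chosen view was in that round.

```python
import math


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration (rate gamma)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, chosen, reward, gamma):
    """Update only the chosen view, using an importance-weighted reward in [0, 1]."""
    probs = exp3_probabilities(weights, gamma)
    estimated_reward = reward / probs[chosen]
    updated = list(weights)
    updated[chosen] *= math.exp(gamma * estimated_reward / len(weights))
    return updated


# Three hypothetical views; view 0 is chosen three times and earns the full reward.
weights = [1.0, 1.0, 1.0]
for _ in range(3):
    weights = exp3_update(weights, chosen=0, reward=1.0, gamma=0.1)
probs = exp3_probabilities(weights, gamma=0.1)
print(probs)
```

In a full bandit loop the view would be sampled from `probs` rather than fixed; the choice is hard-coded here only to keep the example deterministic.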
10.
BMC Bioinformatics ; 25(1): 282, 2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39198740

ABSTRACT

BACKGROUND: Thermostability is a fundamental property that proteins need in order to maintain their biological functions. Predicting protein stability changes upon mutation is important for understanding protein structure-function relationships, and is also of great interest in protein engineering and pharmaceutical design. RESULTS: Here we present mutDDG-SSM, a deep learning-based framework that uses the geometric representations encoded in protein structure to predict mutation-induced protein stability changes. mutDDG-SSM consists of two parts: a graph attention network-based protein structural feature extractor that is trained with a self-supervised learning scheme using large-scale high-resolution protein structures, and an eXtreme Gradient Boosting model-based stability change predictor that has the advantage of alleviating the overfitting problem. The performance of mutDDG-SSM was tested on several widely used independent datasets. Then, myoglobin and p53 were used as case studies to illustrate the effectiveness of the model in predicting protein stability changes upon mutation. Our results show that mutDDG-SSM achieved high performance in estimating the effects of mutations on protein stability. In addition, mutDDG-SSM exhibited good unbiasedness: its prediction accuracy on the inverse mutations is as good as that on the direct mutations. CONCLUSION: Meaningful features can be extracted from our pre-trained model to build downstream tasks, and our model may serve as a valuable tool for protein engineering and drug design.


Subject(s)
Mutation , Protein Stability , Proteins , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Myoglobin/chemistry , Myoglobin/genetics , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/chemistry , Tumor Suppressor Protein p53/metabolism , Computational Biology/methods , Deep Learning , Supervised Machine Learning , Databases, Protein , Protein Conformation
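
The mutDDG-SSM abstract mentions unbiasedness between direct and inverse mutations. The underlying thermodynamic constraint is that the stability change of a mutation and of its reverse must cancel, i.e. ΔΔG(A→B) = −ΔΔG(B→A). A simple check, sketched below with hypothetical predictions in kcal/mol, is to average the sum of each direct/inverse pair; a value near zero indicates an antisymmetric predictor. The function name and the numbers are assumptions for illustration.

```python
def antisymmetry_bias(direct, inverse):
    """Mean of ddG(direct) + ddG(inverse); zero for a perfectly antisymmetric predictor."""
    if len(direct) != len(inverse) or not direct:
        raise ValueError("need two non-empty lists of equal length")
    return sum(d + i for d, i in zip(direct, inverse)) / len(direct)


# Hypothetical predicted ddG values for three mutations and their reverses.
direct_predictions = [1.2, -0.5, 2.0]
inverse_predictions = [-1.0, 0.4, -1.7]
print(antisymmetry_bias(direct_predictions, inverse_predictions))
```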
11.
BMC Bioinformatics ; 25(1): 135, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38549073

ABSTRACT

The Michaelis constant (KM) is one of the essential parameters of enzyme kinetics in the fields of protein engineering, enzyme engineering, and synthetic biology. As experimental measurement of KM is difficult and time-consuming, predicting KM values with machine and deep learning models would accelerate enzyme kinetics studies. Existing machine and deep learning models are limited to specific enzymes, i.e., a minority of enzymes or wildtype enzymes. Here, we used the deep learning framework PaddlePaddle to implement a machine and deep learning approach (GraphKM) for KM prediction of wildtype and mutant enzymes. GraphKM is composed of graph neural networks (GNNs), fully connected layers, and a gradient boosting framework. We represented the substrates as molecular graphs and the enzymes through a pretrained transformer-based language model to construct the model inputs. We compared the results obtained with different GNNs (GIN, GAT, GCN, and GAT-GCN). The GAT-GCN-based model generally outperformed the others. To evaluate the prediction performance of GraphKM and other reported KM prediction models, we collected an independent KM dataset (HXKm) from the literature.


Subject(s)
Deep Learning , Electric Power Supplies , Language , Neural Networks, Computer , Protein Engineering
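
For readers less familiar with the quantity GraphKM predicts: in the Michaelis-Menten rate law, v = Vmax·[S] / (KM + [S]), KM is the substrate concentration at which the reaction runs at half its maximum rate. The short sketch below illustrates that property. The parameter values are arbitrary illustration numbers, not data from the paper.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Reaction rate v = v_max * [S] / (K_M + [S]) for substrate concentration [S]."""
    if substrate < 0 or k_m <= 0:
        raise ValueError("substrate must be >= 0 and k_m must be > 0")
    return v_max * substrate / (k_m + substrate)


# Arbitrary illustrative parameters (units are assumed to be consistent).
v_max, k_m = 10.0, 2.0
print(michaelis_menten_rate(k_m, v_max, k_m))        # rate at [S] = K_M
print(michaelis_menten_rate(100 * k_m, v_max, k_m))  # rate approaching v_max
```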
12.
BMC Med ; 22(1): 78, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38378570

ABSTRACT

BACKGROUND: The immunity induced by primary vaccination is effective against COVID-19; however, booster vaccines are needed to maintain vaccine-induced immunity and improve protection against emerging variants. Heterologous boosting is believed to result in more robust immune responses. This study investigated the safety and immunogenicity of the Razi Cov Pars vaccine (RCP) as a heterologous booster dose in people primed with Beijing Bio-Institute of Biological Products Coronavirus Vaccine (BBIBP-CorV). METHODS: We conducted a randomized, double-blind, active-controlled trial in adults aged 18 and over primarily vaccinated with BBIBP-CorV, an inactivated SARS-CoV-2 vaccine. Eligible participants were randomly assigned (1:1) to receive a booster dose of RCP or BBIBP-CorV vaccines. The primary outcome was neutralizing antibody activity measured by a conventional virus neutralization test (cVNT). The secondary efficacy outcomes included specific IgG antibodies against SARS-CoV-2 spike (S1 and receptor-binding domain, RBD) antigens and cell-mediated immunity. We measured humoral antibody responses at 2 weeks (in all participants) and 3 and 6 months (a subgroup of 101 participants) after the booster dose injection. The secondary safety outcomes were solicited and unsolicited immediate, local, and systemic adverse reactions. RESULTS: We recruited 483 eligible participants between December 7, 2021, and January 13, 2022. The mean age was 51.9 years, and 68.1% were men. Neutralizing antibody titers increased about 3 (geometric mean fold increase, GMFI = 2.77, 95% CI 2.26-3.39) and 21 (GMFI = 21.51, 95% CI 16.35-28.32) times compared to the baseline in the BBIBP-CorV and the RCP vaccine groups. Geometric mean ratios (GMR) and 95% CI for serum neutralizing antibody titers for RCP compared with BBIBP-CorV on days 14, 90, and 180 were 6.81 (5.32-8.72), 1.77 (1.15-2.72), and 2.37 (1.62-3.47) respectively. 
We observed a similar pattern for specific antibody responses against S1 and RBD. We detected a rise in gamma interferon (IFN-γ), tumor necrosis factor (TNF-α), and interleukin 2 (IL-2) following stimulation with S antigen, particularly in the RCP group, and the flow cytometry examination showed an increase in the percentage of CD3 + /CD8 + lymphocytes. RCP and BBIBP-CorV had similar safety profiles; we identified no vaccine-related or unrelated deaths. CONCLUSIONS: BBIBP-CorV and RCP vaccines as booster doses are safe and provide a strong immune response that is more robust when the RCP vaccine is used. Heterologous vaccines are preferred as booster doses. TRIAL REGISTRATION: This study was registered with the Iranian Registry of Clinical Trial at www.irct.ir , IRCT20201214049709N4. Registered 29 November 2021.


Subject(s)
COVID-19 Vaccines , Spike Glycoprotein, Coronavirus , Vaccines, Inactivated , Adult , Male , Humans , Adolescent , Middle Aged , Female , COVID-19 Vaccines/adverse effects , Iran , Antibodies, Neutralizing , Antibodies, Viral
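
The trial above reports geometric mean fold increases (GMFI) and geometric mean ratios (GMR) of antibody titers. Titers are roughly log-normally distributed, so they are averaged on the log scale. A minimal sketch of the two calculations follows; the titers are made-up values, not trial data, and the confidence intervals reported in the abstract are not computed here.

```python
import math


def geometric_mean(values):
    """Exponential of the mean log value; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_fold_increase(baseline, post):
    """Geometric mean of each participant's post/baseline titer ratio."""
    return geometric_mean([after / before for before, after in zip(baseline, post)])


def geometric_mean_ratio(group_a, group_b):
    """Ratio of the geometric mean titers of two independent groups."""
    return geometric_mean(group_a) / geometric_mean(group_b)


# Hypothetical neutralizing antibody titers for three participants.
baseline_titers = [10.0, 20.0, 40.0]
post_booster_titers = [40.0, 80.0, 160.0]
print(geometric_mean_fold_increase(baseline_titers, post_booster_titers))
print(geometric_mean_ratio(post_booster_titers, baseline_titers))
```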
13.
Brief Bioinform ; 23(6)2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36377749

ABSTRACT

MicroRNAs (miRNAs) are closely related to a variety of human diseases: they regulate gene expression, play an important role in human life activities, and are viable targets of small molecule drugs for disease treatment. Current computational techniques for predicting potential associations between small molecules and miRNAs are not sufficiently accurate. Here, we propose a new computational method based on a deep autoencoder and a scalable tree boosting model (DAESTB) to predict associations between small molecules and miRNAs. First, we constructed a high-dimensional feature matrix by integrating small molecule-small molecule similarity, miRNA-miRNA similarity and known small molecule-miRNA associations. Second, we reduced the feature dimensionality of the integrated matrix using a deep autoencoder to obtain the latent feature representation of each small molecule-miRNA pair. Finally, a scalable tree boosting model was used to predict potential small molecule-miRNA associations. Experiments on two datasets demonstrated the superiority of DAESTB over various state-of-the-art methods. DAESTB achieved the best AUC value. Furthermore, in three case studies, a large number of the associations predicted by DAESTB were confirmed by publicly accessible literature. We envision that DAESTB could serve as a useful biological model for predicting potential small molecule-miRNA associations.


Subject(s)
MicroRNAs , Humans , Algorithms , Computational Biology/methods , Genetic Predisposition to Disease , MicroRNAs/genetics , MicroRNAs/metabolism , Models, Biological
14.
Brief Bioinform ; 23(1)2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34850821

ABSTRACT

2'-O-methylation (Nm) is a post-transcriptional modification of RNA that is catalyzed by 2'-O-methyltransferase and involves replacing the H on the 2'-hydroxyl group with a methyl group. The 2'-O-methylation modification site is detected in a variety of RNA types (miRNA, tRNA, mRNA, etc.), plays an important role in biological processes and is associated with different diseases. Few of its functional mechanisms have been elucidated so far, and exploring them with traditional high-throughput experiments is time-consuming and expensive. For a deeper understanding of the relevant biological mechanisms, it is necessary to develop efficient and accurate recognition tools based on machine learning. We therefore constructed a predictor called NmRF, based on optimal mixed features and a random forest classifier, to identify 2'-O-methylation modification sites. The predictor can identify modification sites of multiple species at the same time. To obtain a better prediction model, a two-step strategy is adopted; that is, the optimal hybrid feature set is obtained by combining the light gradient boosting algorithm and an incremental feature selection strategy. In 10-fold cross-validation, the accuracies for Homo sapiens and Saccharomyces cerevisiae were 89.069% and 93.885%, and the AUCs were 0.9498 and 0.9832, respectively. The rigorous 10-fold cross-validation and independent tests confirm that the proposed method is significantly better than existing tools. A user-friendly web server is accessible at http://lab.malab.cn/~acy/NmRF.


Subject(s)
Computational Biology , Machine Learning , Base Sequence , Computational Biology/methods , Humans , Methylation , RNA/genetics
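
The two-step strategy described for NmRF (rank features, then apply incremental feature selection) can be sketched as follows. Features are assumed to be already ranked from most to least important, for example by a gradient boosting model, and `evaluate` is a hypothetical stand-in for whatever score is used, such as the cross-validated accuracy of a random forest. The toy scoring function is made up so that the example runs without any training.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Evaluate the top-1, top-2, ... top-k prefixes and keep the best-scoring one."""
    best_score = float("-inf")
    best_subset = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        if score > best_score:
            best_score, best_subset = score, subset
    return best_subset, best_score


# Hypothetical per-feature contributions; f3 hurts the score, f4 barely helps.
TOY_GAINS = {"f1": 0.30, "f2": 0.20, "f3": -0.05, "f4": 0.02}


def toy_evaluate(subset):
    """Stand-in for a cross-validated accuracy estimate (assumption, not real data)."""
    return 0.4 + sum(TOY_GAINS[name] for name in subset)


ranked = ["f1", "f2", "f3", "f4"]  # assumed output of an importance ranking
best_subset, best_score = incremental_feature_selection(ranked, toy_evaluate)
print(best_subset, best_score)
```

Because only prefixes of the ranking are tried, the search costs one evaluation per feature instead of one per subset, at the price of depending entirely on the quality of the ranking.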
15.
J Transl Med ; 22(1): 140, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38321494

ABSTRACT

Building Single Sample Predictors (SSPs) from gene expression profiles presents challenges, notably due to the lack of calibration across diverse gene expression measurement technologies. However, recent research indicates the viability of classifying phenotypes based on the order of expression of multiple genes. Existing SSP methods often rely on Top Scoring Pairs (TSP), which are platform-independent and easy to interpret through the concept of "relative expression reversals". Nevertheless, TSP methods face limitations in classifying complex patterns involving comparisons of more than two gene expressions. To overcome these constraints, we introduce a novel approach that extends TSP rules by constructing rank-based trees capable of encompassing extensive gene-gene comparisons. This method is bolstered by incorporating two ensemble strategies, boosting and random forest, to mitigate the risk of overfitting. Our implementation of ensemble rank-based trees employs boosting with LogitBoost cost and random forests, addressing both binary and multi-class classification problems. In a comparative analysis across 12 cancer gene expression datasets, our proposed methods demonstrate superior performance over both the k-TSP classifier and nearest template prediction methods. We have further refined our approach to facilitate variable selection and the generation of clear, precise decision rules from rank-based trees, enhancing interpretability. The cumulative evidence from our research underscores the significant potential of ensemble rank-based trees in advancing disease classification via gene expression data, offering a robust, interpretable, and scalable solution. Our software is available at https://CRAN.R-project.org/package=ranktreeEnsemble .


Subject(s)
Neoplasms , Transcriptome , Humans , Software , Neoplasms/genetics , Oncogenes , Algorithms
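
The Top Scoring Pairs (TSP) rule that this abstract builds on is easy to state in code: a single rule compares the expression of two genes within one sample, and a k-TSP classifier takes a majority vote over several such rules. The sketch below shows only this baseline idea, not the rank-based trees of ranktreeEnsemble; gene names, values, and the class coding are hypothetical.

```python
def tsp_vote(expression, gene_a, gene_b):
    """One Top Scoring Pair rule: vote 1 if gene_a is expressed below gene_b, else 0."""
    return 1 if expression[gene_a] < expression[gene_b] else 0


def k_tsp_predict(expression, pairs):
    """Majority vote over several gene-pair rules (use an odd number of pairs)."""
    votes = sum(tsp_vote(expression, a, b) for a, b in pairs)
    return 1 if 2 * votes > len(pairs) else 0


# Hypothetical single-sample expression profile and gene pairs.
sample = {"G1": 2.0, "G2": 5.0, "G3": 7.0, "G4": 1.0, "G5": 0.5, "G6": 3.0}
pairs = [("G1", "G2"), ("G3", "G4"), ("G5", "G6")]
print(k_tsp_predict(sample, pairs))

# Rescaling every value (e.g. a different measurement platform) keeps the
# within-sample ordering, so the prediction does not change.
rescaled = {gene: 10.0 * value + 3.0 for gene, value in sample.items()}
print(k_tsp_predict(rescaled, pairs))
```

The second call illustrates why such rules are described as platform-independent: only the within-sample order of the genes matters, not their absolute values.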
16.
J Med Virol ; 96(3): e29542, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38506170

ABSTRACT

The emergence of new variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) necessitates booster vaccination. We evaluated the long-term safety and immunogenicity of heterologous boosting with a SARS-CoV-2 messenger RNA vaccine, SYS6006. A total of 1000 participants aged 18 years or more who had received two (Group A) or three (Group B) doses of SARS-CoV-2 inactivated vaccine were enrolled and vaccinated with one dose of SYS6006, which was designed based on the prototype spike protein and introduced mutation sites. Adverse events (AEs) through 30 days and serious AEs (SAEs) during the study were collected. Live-virus and pseudovirus neutralizing antibody (Nab), binding antibody (immunoglobulin G [IgG]) and cellular immunity were tested through 180 days. Solicited overall, injection-site, and systemic AEs were reported by 618 (61.8%), 498 (49.8%), and 386 (38.6%) participants, respectively. Most AEs were grade 1. The two groups had similar safety profiles. No vaccination-related SAEs were reported. A robust wild-type (WT) live-virus Nab response was elicited, with peak geometric mean titers (GMTs) of 3769.5 (Group A) and 5994.7 (Group B) on day 14, corresponding to 1602.5- and 290.8-fold increases versus baseline, respectively. The BA.5 live-virus Nab GMTs were 87.7 (Group A) and 93.2 (Group B) on day 14. All participants seroconverted for WT live-virus Nab. Robust pseudovirus Nab and IgG responses to WT and BA.5 were also elicited. ELISpot assay showed a robust cellular immune response, which was not obviously affected by virus variation. In conclusion, SYS6006 heterologous boosting demonstrated good long-term safety and immunogenicity in participants who had received two or three doses of SARS-CoV-2 inactivated vaccine.


Subject(s)
COVID-19 Vaccines , COVID-19 , Immunogenicity, Vaccine , Humans , Antibodies, Neutralizing , Antibodies, Viral , China , COVID-19/prevention & control , Immunoglobulin G , mRNA Vaccines , Vaccines, Inactivated
17.
J Card Fail ; 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39299541

ABSTRACT

INTRODUCTION: Optimal management of outpatients with heart failure (HF) requires serially updating estimates of their risk for adverse clinical outcomes to guide treatment. Patient-reported outcomes (PROs) are increasingly used in clinical care. The purpose of this study was to determine whether inclusion of PROs can improve risk prediction for HF hospitalization and death in ambulatory HF patients. METHODS: We included consecutive patients with HF with reduced ejection fraction (HFrEF) and HF with preserved ejection fraction (HFpEF) seen in an HF clinic between 2015 and 2019 who completed PROs as part of routine care. Cox regression with least absolute shrinkage and selection operator (LASSO) regularization and gradient boosting machine (GBM) analyses were used to estimate risk for a combined outcome of HF hospitalization, heart transplant, left ventricular assist device implantation or death. The performance of the prediction models was evaluated with the time-dependent concordance index (Cτ). RESULTS: Among 1165 patients with HFrEF (mean age 59.1±16.1 years, 68% male) the median follow-up was 487 days, and among 456 patients with HFpEF (mean age 64.2±16.0 years, 55% male) the median follow-up was 494 days. Gradient boosting regression that included PROs had the best prediction performance (Cτ 0.73 for patients with HFrEF and 0.74 for patients with HFpEF) and showed very good stratification of risk in time-to-event analysis by quintile of risk. The Kansas City Cardiomyopathy Questionnaire overall summary score (KCCQ-12 OSS), Visual Analogue Scale (VAS) and Patient Reported Outcomes Measurement Information System (PROMIS) dimensions of Satisfaction with social roles and Physical function had high variable importance in the models. CONCLUSIONS: PROs improve risk prediction in both HFrEF and HFpEF, independent of traditional clinical factors. Routine assessment of PROs and leveraging the comprehensive data in the electronic health record in routine clinical care could help assess risk more accurately and support the intensification of treatment in patients with HF.
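
For readers unfamiliar with concordance statistics, the following is a minimal sketch of Harrell's concordance index on right-censored data. It is a simplification and not the time-dependent Cτ reported in the abstract, and it is not the authors' code; the function name and the toy inputs are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    shorter observed time also has the higher predicted risk.

    times       -- observed follow-up times
    events      -- 1 if the outcome occurred, 0 if the subject was censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # subject actually had the event (was not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.73 and 0.74.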

18.
Chemphyschem ; : e202400629, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982718

ABSTRACT

Electrode materials are essential to the electrochemical process of storing charge in supercapacitors and have a significant impact on the cost and capacitive performance of the final product. Precise prediction of the capacitance of electrode materials is therefore important for the further development of supercapacitors. MgCo₂O₄, with a theoretical capacitance of up to 3122 F g⁻¹, holds immense research value as an electrode material. The objective of this study is to predict the capacitance of MgCo₂O₄ with high accuracy, by extracting a large amount of data from published papers and using selected parameters as input features. The Recursive Feature Elimination (RFE) method was employed, using Random Forest (RF), Extreme Gradient Boosting (XGBoost) and Regression Tree (RT) as selectors to identify the optimal feature subset. The selected subsets were then combined with these three regression models to construct nine machine learning (ML) models. After performance evaluation and outlier analysis, the XGB-RFE-XGB model achieved an R-squared (R²), root mean squared error (RMSE), and mean absolute error (MAE) of 0.95, 111.83 F g⁻¹ and 68.25 F g⁻¹, respectively, demonstrating its stability and reliability. Therefore, the XGB-RFE-XGB model can be used as a reliable predictive tool in subsequent experimental designs.
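
As an illustration of how three selectors crossed with three regressors give nine models, and of how the three reported metrics are defined, here is a minimal sketch. The naming pattern "selector-RFE-regressor" is inferred from the label XGB-RFE-XGB and is an assumption, as are the function names; no model is trained here and none of the study's data is used.

```python
import math
from itertools import product

# Abbreviations follow the abstract; the label format is an assumption.
SELECTORS = ("RF", "XGB", "RT")
REGRESSORS = ("RF", "XGB", "RT")


def model_names():
    """Enumerate every selector/regressor pairing (3 x 3 = 9 labels)."""
    return [f"{sel}-RFE-{reg}" for sel, reg in product(SELECTORS, REGRESSORS)]


def regression_metrics(y_true, y_pred):
    """Return R², RMSE and MAE for paired observed and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in residuals) / n,
    }
```

RMSE and MAE carry the units of the target (here F g⁻¹), while R² is dimensionless, which is why the abstract reports units for only two of the three values.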

19.
Br J Clin Pharmacol ; 90(3): 691-699, 2024 03.
Article in English | MEDLINE | ID: mdl-37845041

ABSTRACT

AIMS: Heart failure with reduced ejection fraction (HFrEF) poses significant challenges for clinicians and researchers, owing to its multifaceted aetiology and complex treatment regimens. In light of this, artificial intelligence methods offer an innovative approach to identifying relationships within complex clinical datasets. Our study aims to explore the potential for machine learning algorithms to provide deeper insights into datasets of HFrEF patients. METHODS: To this end, we analysed a cohort of 386 HFrEF patients who had been initiated on sodium-glucose co-transporter-2 inhibitor treatment and had completed a minimum of 6 months of follow-up. RESULTS: In traditional frequentist statistical analyses, patients receiving the highest doses of beta-blockers (BBs) (chi-square test, P = .036) and those newly initiated on sacubitril-valsartan (chi-square test, P = .023) showed better outcomes. However, none of these pharmacological features stood out as independent predictors of improved outcomes in the Cox proportional hazards model. In contrast, when applying eXtreme Gradient Boosting (XGBoost) algorithms to the data together with Shapley additive explanations (SHAP), we identified several models with significant predictive power. The XGBoost algorithm inherently accommodates non-linear distribution, multicollinearity and confounding. Within this framework, pharmacological categories like 'newly initiated treatment with sacubitril/valsartan' and 'BB dose escalation' emerged as strong predictors of long-term outcomes. CONCLUSIONS: In this manuscript, we not only emphasize the strengths of this machine learning approach but also discuss its potential limitations and the risk of identifying statistically significant yet clinically irrelevant predictors.


Subject(s)
Heart Failure , Humans , Heart Failure/drug therapy , Heart Failure/chemically induced , Tetrazoles/adverse effects , Artificial Intelligence , Stroke Volume , Machine Learning
20.
Clin Transplant ; 38(4): e15316, 2024 04.
Article in English | MEDLINE | ID: mdl-38607291

ABSTRACT

BACKGROUND: The incidence of graft failure following liver transplantation (LTx) is consistent. While traditional risk scores for LTx have limited accuracy, the potential of machine learning (ML) in this area remains uncertain, despite its promise in other transplant domains. This study aims to determine ML's predictive limitations in LTx by replicating methods used in previous heart transplant research. METHODS: This study utilized the UNOS STAR database, selecting 64,384 adult patients who underwent LTx between 2010 and 2020. Gradient boosting models (XGBoost and LightGBM) were used to predict 14-, 30-, and 90-day graft failure and were compared with a conventional logistic regression model. Models were evaluated using both shuffled and rolling cross-validation (CV) methodologies. Model performance was assessed using the AUC across validation iterations. RESULTS: In a comparison of predictive models for 14-day, 30-day and 90-day graft survival, LightGBM consistently outperformed other models, achieving the highest AUCs of .740, .722, and .700, respectively, with shuffled CV. However, with rolling CV the accuracy of the model declined for every ML algorithm. The analysis revealed influential factors for graft survival prediction across all models, including total bilirubin, medical condition, recipient age, and donor AST, among others. Several features, such as donor age and recipient diabetes history, were important in two out of three models. CONCLUSIONS: LightGBM enhances short-term graft survival predictions post-LTx. However, due to changing medical practices and selection criteria, continuous model evaluation is essential. Future studies should focus on temporal variations and clinical implications, and should ensure model transparency for broader medical utility.
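
The difference between the two validation schemes can be sketched as follows. This assumes that "rolling CV" means forward chaining over chronologically ordered records (train on the past, test on the next block), which the abstract does not spell out; the function names are hypothetical and this is not the authors' implementation.

```python
import random


def shuffled_folds(n_samples, n_folds, seed=0):
    """Shuffled k-fold: test folds are drawn at random, so training data
    may include records that are later in time than the test records."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        train = sorted(i for m, fold in enumerate(folds) if m != k for i in fold)
        splits.append((train, sorted(folds[k])))
    return splits


def rolling_folds(n_samples, n_folds):
    """Rolling (forward-chaining) splits: indices are assumed to be sorted
    by time, and every test block lies strictly after its training data."""
    block = n_samples // (n_folds + 1)
    if block < 1:
        raise ValueError("need at least n_folds + 1 samples")
    splits = []
    for k in range(1, n_folds + 1):
        train = list(range(0, k * block))
        # The last test block absorbs any remainder.
        end = (k + 1) * block if k < n_folds else n_samples
        splits.append((train, list(range(k * block, end))))
    return splits
```

Under the rolling scheme a model never sees records from the future of its test block, so a drop in AUC relative to shuffled CV is consistent with the abstract's point about changing medical practices and selection criteria.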


Subject(s)
Liver Transplantation , Adult , Humans , Liver Transplantation/adverse effects , Research Design , Algorithms , Bilirubin , Machine Learning