Results 1 - 20 of 280
1.
Brief Bioinform ; 25(6), 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39350338

ABSTRACT

Accurate prediction of transcription factor binding sites (TFBSs) is essential for understanding gene regulation mechanisms and the etiology of diseases. Despite numerous advances in deep learning for predicting TFBSs, their performance can still be enhanced. In this study, we propose MLSNet, a novel deep learning architecture designed specifically to predict TFBSs. MLSNet innovatively integrates multisize convolutional fusion with long short-term memory (LSTM) networks to effectively capture DNA-sparse higher-order sequence features. Further, MLSNet incorporates super token attention and Bi-LSTM to systematically extract and integrate higher-order DNA shape features. Experimental results on 165 ChIP-seq (chromatin immunoprecipitation followed by sequencing) datasets indicate that MLSNet consistently outperforms several state-of-the-art algorithms in the prediction of TFBSs. Specifically, MLSNet reports average metrics: 0.8306 for ACC, 0.8992 for AUROC, and 0.9035 for AUPRC, surpassing the second-best methods by 1.82%, 1.68%, and 1.54%, respectively. This research delineates the effectiveness of combining multi-size convolutional layers with LSTM and DNA shape-based features in enhancing predictive accuracy. Moreover, this study comprehensively assesses the variability in model performance across different cell lines and transcription factors. The source code of MLSNet is available at https://github.com/minghaidea/MLSNet.


Subject(s)
Deep Learning , Transcription Factors , Transcription Factors/metabolism , Binding Sites , Algorithms , Computational Biology/methods , Humans , Chromatin Immunoprecipitation Sequencing/methods , DNA/metabolism , DNA/chemistry
2.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39276327

ABSTRACT

Recent advancements in high-throughput sequencing technologies have significantly enhanced our ability to unravel the intricacies of gene regulatory processes. A critical challenge in this endeavor is the identification of variant effects, a key factor in comprehending the mechanisms underlying gene regulation. Non-coding variants, constituting over 90% of all variants, have garnered increasing attention in recent years. The exploration of gene variant impacts and regulatory mechanisms has spurred the development of various deep learning approaches, providing new insights into the global regulatory landscape through the analysis of extensive genetic data. Here, we provide a comprehensive overview of the development of the non-coding variants models based on bulk and single-cell sequencing data and their model-based interpretation and downstream tasks. This review delineates the popular sequencing technologies for epigenetic profiling and deep learning approaches for discerning the effects of non-coding variants. Additionally, we summarize the limitations of current approaches in variant effect prediction research and outline opportunities for improvement. We anticipate that our study will offer a practical and useful guide for the bioinformatic community to further advance the unraveling of genetic variant effects.


Subject(s)
Deep Learning , Genetic Variation , Humans , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Epigenesis, Genetic
3.
Ecotoxicol Environ Saf ; 285: 117023, 2024 Sep 14.
Article in English | MEDLINE | ID: mdl-39278001

ABSTRACT

Wildfires have devastating effects on society and public health. However, few population-based cohort studies have analyzed the relationship between wildfire-related PM2.5, an important component of wildfire smoke, and cancer-specific mortality. We aimed to explore this relationship and identify vulnerable populations in the UK, a setting with lower levels of wildfire-related PM2.5 exposure. The study included 492,394 participants (age: 38-73 years) recruited by the UK Biobank during 2004-2010. Cumulative wildfire-related PM2.5 within 10 kilometers of residence over three years, assessed by chemical transport and machine learning models, was used as the exposure. A time-varying Cox regression was used to explore the relationship between exposure and diverse cancer-specific mortality outcomes. Subgroup analyses of a range of potential modifiers were performed. Each 10 µg/m3 increment of 3-year cumulative exposure was related to a 0.4% greater risk of total cancer mortality (95% CI: 1.001-1.007), a 1.1% greater risk of lung cancer mortality (95% CI: 1.004-1.018), and a 2.7% greater risk of lip, oral cavity and pharynx (LOP) cancer mortality (95% CI: 1.005-1.049). In the wildfire-related PM2.5-lung cancer relationship, retired participants were more vulnerable than those with other employment statuses. Even lower levels of exposure to PM2.5 from wildfires were related to elevated mortality risks for total, lung, and LOP cancer, highlighting the importance of wildfire prevention and control. Further investigations are warranted to enrich and extend existing knowledge in this field.

4.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-39320317

ABSTRACT

BACKGROUND: Antimicrobial resistance is a serious threat to global health. Due to the stagnant antibiotic discovery pipeline, bacteriophages (phages) have been proposed as an alternative therapy for the treatment of infections caused by multidrug-resistant pathogens. Genomic features play an important role in phage pharmacology. However, our knowledge of phage genomics is sparse, and the use of existing bioinformatic pipelines and tools requires considerable bioinformatic expertise. These challenges have substantially limited the clinical translation of phage therapy. FINDINGS: We have developed PhageGE (Phage Genome Explorer), a user-friendly graphical interface application for the interactive analysis of phage genomes. PhageGE enables users to perform key analyses, including phylogenetic analysis, visualization of phylogenetic trees, prediction of phage life cycle, and comparative analysis of phage genome annotations. The new R Shiny web server, PhageGE, integrates existing R packages and combines them with several newly developed functions to facilitate these analyses. Additionally, the web server provides interactive visualization capabilities and allows users to directly export publication-quality images. CONCLUSIONS: PhageGE is a valuable tool that simplifies the analysis of phage genome data and may expedite the development and clinical translation of phage therapy. PhageGE is publicly available at https://jason-zhao.shinyapps.io/PhageGE_Update/.


Subject(s)
Bacteriophages , Genome, Viral , Software , Bacteriophages/genetics , Genomics/methods , Computational Biology/methods , Internet , Phylogeny
5.
Adv Healthc Mater ; e2403046, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39263842

ABSTRACT

In the current battle against antibiotic resistance, the resilience of Gram-negative bacteria against traditional antibiotics, which is due not only to their protective outer membranes but also to mechanisms like efflux pumps and enzymatic degradation of drugs, underscores the urgent need for innovative antimicrobial tactics. This study presents an innovative method involving the synthesis of three furoxan derivatives engineered to self-assemble into nitric oxide (NO) donor nanoparticles (FuNPs). These FuNPs, notably when supplied together with polymyxin B (PMB), achieve markedly enhanced bactericidal efficacy against a wide spectrum of bacterial phenotypes at considerably lower NO concentrations (0.1-2.8 µg mL⁻¹), which is at least ten times lower than the reported data for NO donors (≥200 µg mL⁻¹). The bactericidal mechanism is elucidated using confocal, scanning, and transmission electron microscopy techniques. Neutron reflectometry confirms that FuNPs initiate membrane disruption by specifically engaging with the polysaccharides on bacterial surfaces, causing structural perturbations. Subsequently, PMB binds to lipid A on the outer membrane, enhancing permeability and resulting in a synergistic bactericidal action with FuNPs. This pioneering strategy underscores the utility of self-assembly in NO delivery as a groundbreaking paradigm to circumvent traditional antibiotic resistance barriers, marking a significant leap forward in the development of next-generation antimicrobial agents.

6.
Nucleic Acids Res ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39271121

ABSTRACT

MicroRNAs (miRNAs) are short non-coding RNAs involved in various cellular processes, playing a crucial role in gene regulation. Identifying miRNA targets remains a central challenge and is pivotal for elucidating the complex gene regulatory networks. Traditional computational approaches have predominantly focused on identifying miRNA targets through perfect Watson-Crick base pairings within the seed region, referred to as canonical sites. However, emerging evidence suggests that perfect seed matches are not a prerequisite for miRNA-mediated regulation, underscoring the importance of also recognizing imperfect, or non-canonical, sites. To address this challenge, we propose Mimosa, a new computational approach that employs the Transformer framework to enhance the prediction of miRNA targets. Mimosa distinguishes itself by integrating contextual, positional and base-pairing information to capture in-depth attributes, thereby improving its predictive capabilities. Its unique ability to identify non-canonical base-pairing patterns makes Mimosa a standout model, reducing the reliance on pre-selecting candidate targets. Mimosa achieves superior performance in gene-level predictions and also shows impressive performance in site-level predictions across various non-human species through extensive benchmarking tests. To facilitate research efforts in miRNA targeting, we have developed an easy-to-use web server for comprehensive end-to-end predictions, which is publicly available at http://monash.bioweb.cloud.edu.au/Mimosa.
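The notion of a canonical site described in this abstract can be illustrated with a minimal Python sketch. It is not the Mimosa method; it only shows the traditional perfect seed-match search that the abstract contrasts Mimosa with. The seed definition (miRNA positions 2-8) and the function names `seed_site` and `find_canonical_sites` are assumptions made for illustration.

```python
# Watson-Crick pairs for RNA. The seed window (positions 2-8 of the miRNA)
# is an assumption; other seed definitions (e.g. 2-7) are also in use.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the target sequence (5'->3') that pairs perfectly with the seed."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # Pairing is antiparallel, so reverse the seed before complementing it.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_canonical_sites(mirna: str, target: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in the target."""
    site = seed_site(mirna)
    target = target.upper().replace("T", "U")
    width = len(site)
    return [i for i in range(len(target) - width + 1)
            if target[i:i + width] == site]
```

A search of this kind misses every non-canonical site by construction, which is the limitation the abstract highlights.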

7.
mSystems ; 9(9): e0078924, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39150244

ABSTRACT

Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) is widely used in clinical microbiology laboratories for bacterial identification but its use for detection of antimicrobial resistance (AMR) remains limited. Here, we used MALDI-TOF MS with artificial intelligence (AI) approaches to successfully predict AMR in Pseudomonas aeruginosa, a priority pathogen with complex AMR mechanisms. The highest performance was achieved for modern β-lactam/β-lactamase inhibitor drugs, namely, ceftazidime/avibactam and ceftolozane/tazobactam. For these drugs, the model demonstrated an area under the receiver operating characteristic curve (AUROC) of 0.869 and 0.856, specificity of 0.925 and 0.897, and sensitivity of 0.731 and 0.714, respectively. As part of this work, we developed dynamic binning, a feature engineering technique that effectively reduces the high-dimensional feature set and has wide-ranging applicability to MALDI-TOF MS data. Compared to conventional feature engineering approaches, the dynamic binning method yielded the highest performance in 7 of 10 antimicrobials. Moreover, we showcased the efficacy of transfer learning in enhancing the AUROC performance for 8 of 11 antimicrobials. By assessing the contribution of features to the model's prediction, we identified proteins that may contribute to AMR mechanisms. Our findings demonstrate the potential of combining AI with MALDI-TOF MS as a rapid AMR diagnostic tool for Pseudomonas aeruginosa. IMPORTANCE: Pseudomonas aeruginosa is a key bacterial pathogen that causes significant global morbidity and mortality. Antimicrobial resistance (AMR) emerges rapidly in P. aeruginosa and is driven by complex mechanisms. Drug-resistant P. aeruginosa is a major challenge in clinical settings due to limited treatment options. Early detection of AMR can guide antibiotic choices, improve patient outcomes, and avoid unnecessary antibiotic use.
Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) is widely used for rapid species identification in clinical microbiology. In this study, we repurposed mass spectra generated by MALDI-TOF and used them as inputs for artificial intelligence approaches to successfully predict AMR in P. aeruginosa for multiple key antibiotic classes. This work represents an important advance toward using MALDI-TOF as a rapid AMR diagnostic for P. aeruginosa in clinical settings.
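For readers less familiar with the three metrics reported in this abstract, the sketch below shows how sensitivity, specificity, and AUROC are generally computed from labels and model scores. It is a generic illustration, not the authors' evaluation code; the label coding (1 = resistant, 0 = susceptible) and the function names are assumptions.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity from binary labels and binary predictions.

    Assumed coding: 1 = resistant (positive), 0 = susceptible (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, scores):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is quadratic in the
    number of samples and is meant for illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Sensitivity and specificity depend on the decision threshold used to turn scores into predictions, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract can report a high specificity alongside a lower sensitivity for the same model.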


Subject(s)
Anti-Bacterial Agents , Artificial Intelligence , Pseudomonas aeruginosa , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Tazobactam , Pseudomonas aeruginosa/drug effects , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Anti-Bacterial Agents/pharmacology , Humans , Tazobactam/pharmacology , Tazobactam/therapeutic use , Pseudomonas Infections/drug therapy , Pseudomonas Infections/microbiology , Microbial Sensitivity Tests/methods , Drug Resistance, Bacterial , Drug Combinations , Ceftazidime/pharmacology , Azabicyclo Compounds/pharmacology , Cephalosporins
8.
Int J Parasitol ; 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39168434

ABSTRACT

Millions of livestock animals worldwide are infected with the haematophagous barber's pole worm, Haemonchus contortus, the aetiological agent of haemonchosis. Despite the major significance of this parasite worldwide and its widespread resistance to current treatments, the lack of a high-quality genome for the well-defined strain of this parasite from Australia, called Haecon-5, has constrained research in a number of areas including host-parasite interactions, drug discovery and population genetics. To enable research in these areas, we report here a chromosome-contiguous genome (∼280 Mb) for Haecon-5 with high-quality models for 19,234 protein-coding genes. Comparative genomic analyses show significant genomic similarity (synteny) with a UK strain of H. contortus, called MHco3(ISE).N1 (abbreviated as "ISE"), but we also discover marked differences in genomic structure/gene arrangements, distribution of nucleotide variability (single nucleotide polymorphisms (SNPs) and indels) and orthology between Haecon-5 and ISE. We used the genome and extensive transcriptomic resources for Haecon-5 to predict a subset of essential single-copy genes employing a "cross-species" machine learning (ML) approach using a range of features from nucleotide/protein sequences, protein orthology, subcellular localisation, single-cell RNA-seq and/or histone methylation data available for the model organisms Caenorhabditis elegans and Drosophila melanogaster. From a set of 1,464 conserved single copy genes, transcribed in key life-cycle stages of H. contortus, we identified 232 genes whose homologs have critical functions in C. elegans and/or D. melanogaster, and prioritised 10 of them for further characterisation; nine of the 10 genes likely play roles in neurophysiological processes, germline, hypodermis and/or respiration, and one is an unknown (orphan) gene for which no detailed functional information exists. 
Future studies of these genes/gene products are warranted to elucidate their roles in parasite biology, host-parasite interplay and/or disease. Clearly, the present Haecon-5 reference genome and associated resources now underpin a broad range of fundamental investigations of H. contortus and could assist in accelerating the discovery of novel intervention targets and drug candidates to combat haemonchosis.

9.
Int J Mol Sci ; 25(16), 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39201452

ABSTRACT

Haemonchus contortus (the barber's pole worm)-a highly pathogenic gastric nematode of ruminants-causes significant economic losses in the livestock industry worldwide. H. contortus has become a valuable model organism for both fundamental and applied research (e.g., drug and vaccine discovery) because of the availability of well-defined laboratory strains (e.g., MHco3(ISE).N1 in the UK and Haecon-5 in Australia) and genomic, transcriptomic and proteomic data sets. Many recent investigations have relied heavily on the use of the chromosome-contiguous genome of MHco3(ISE).N1 in the absence of a genome for Haecon-5. However, there has been no genetic comparison of these and other strains to date. Here, we assembled and characterised the mitochondrial genome (14.1 kb) of Haecon-5 and compared it with that of MHco3(ISE).N1 and two other strains (i.e., McMaster and NZ_Hco_NP) from Australasia. We detected 276 synonymous and 25 non-synonymous single nucleotide polymorphisms (SNPs) within Haecon-5. Between the Haecon-5 and MHco3(ISE).N1 strains, we recorded 345 SNPs, 31 of which were non-synonymous and linked to fixed amino acid differences in seven protein-coding genes (nad5, nad6, nad1, atp6, nad2, cytb and nad4) between these strains. Pronounced variation (344 and 435 SNPs) was seen between Haecon-5 and each of the other two strains from Australasia. The question remains as to what impact these mitogenomic mutations might have on the biology and physiology of H. contortus, which warrants exploration. The high degree of mitogenomic variability recorded here among these strains suggests that further work should be undertaken to assess the nature and extent of the nuclear genomic variation within H. contortus.


Subject(s)
Genome, Mitochondrial , Haemonchus , Polymorphism, Single Nucleotide , Animals , Haemonchus/genetics , Phylogeny , Genetic Variation , Australia
10.
J Chem Inf Model ; 64(15): 6216-6229, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39092854

ABSTRACT

The critical importance of accurately predicting mutations in protein metal-binding sites for advancing drug discovery and enhancing disease diagnostic processes cannot be overstated. In response to this imperative, MetalTrans emerges as an accurate predictor for disease-associated mutations in protein metal-binding sites. The core innovation of MetalTrans lies in its seamless integration of multifeature splicing with the Transformer framework, a strategy that ensures exhaustive feature extraction. Central to MetalTrans's effectiveness is its deep feature combination strategy, which merges evolutionary-scale modeling amino acid embeddings with ProtTrans embeddings, thus shedding light on the biochemical properties of proteins. Employing the Transformer component, MetalTrans leverages the self-attention mechanism to delve into higher-level representations. Utilizing mutation site information for feature fusion not only enriches the feature set but also sidesteps the common pitfall of overestimation linked to protein sequence-based predictions. This nuanced approach to feature fusion is a key differentiator, enabling MetalTrans to outperform existing methods significantly, as evidenced by comparative analyses. Our evaluations across varied metal binding site data sets (specifically Zn, Ca, Mg, and Mix) underscore MetalTrans's superior performance, which achieved the average AUC values of 0.971, 0.965, 0.980, and 0.945 on multiple 5-fold cross-validation, respectively. Remarkably, against the multichannel convolutional neural network method on a benchmark independent test set, MetalTrans demonstrated unparalleled robustness and superiority, boasting the AUC score of 0.998 on multiple 5-fold cross-validation. Our comprehensive examination of the predicted outcomes further confirms the effectiveness of the model. The source codes, data sets, and prediction results for MetalTrans can be accessed for academic usage at https://github.com/EduardWang/MetalTrans.


Subject(s)
Metals , Mutation , Binding Sites , Metals/chemistry , Metals/metabolism , Humans , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Models, Molecular , Computational Biology/methods , Databases, Protein
11.
iScience ; 27(7): 110183, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-38989460

ABSTRACT

Current studies in early cancer detection based on liquid biopsy data often rely on off-the-shelf models and face challenges with heterogeneous data, as well as manually designed data preprocessing pipelines with different parameter settings. To address those challenges, we present AutoCancer, an automated, multimodal, and interpretable transformer-based framework. This framework integrates feature selection, neural architecture search, and hyperparameter optimization into a unified optimization problem with Bayesian optimization. Comprehensive experiments demonstrate that AutoCancer achieves accurate performance in specific cancer types and pan-cancer analysis, outperforming existing methods across three cohorts. We further demonstrated the interpretability of AutoCancer by identifying key gene mutations associated with non-small cell lung cancer to pinpoint crucial factors at different stages and subtypes. The robustness of AutoCancer, coupled with its strong interpretability, underscores its potential for clinical applications in early cancer detection.

12.
Int J Mol Sci ; 25(13), 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-39000124

ABSTRACT

Over the years, comprehensive explorations of the model organisms Caenorhabditis elegans (elegant worm) and Drosophila melanogaster (vinegar fly) have contributed substantially to our understanding of complex biological processes and pathways in multicellular organisms generally. Extensive functional genomic-phenomic, genomic, transcriptomic, and proteomic data sets have enabled the discovery and characterisation of genes that are crucial for life, called 'essential genes'. Recently, we investigated the feasibility of inferring essential genes from such data sets using advanced bioinformatics and showed that a machine learning (ML)-based workflow could be used to extract or engineer features from DNA, RNA, protein, and/or cellular data/information to underpin the reliable prediction of essential genes both within and between C. elegans and D. melanogaster. As these are two distantly related species within the Ecdysozoa, we proposed that this ML approach would be particularly well suited for species that are within the same phylum or evolutionary clade. In the present study, we cross-predicted essential genes within the phylum Nematoda (evolutionary clade V)-between C. elegans and the pathogenic parasitic nematode H. contortus-and then ranked and prioritised H. contortus proteins encoded by these genes as intervention (e.g., drug) target candidates. Using strong, validated predictors, we inferred essential genes of H. contortus that are involved predominantly in crucial biological processes/pathways including ribosome biogenesis, translation, RNA binding/processing, and signalling and which are highly transcribed in the germline, somatic gonad precursors, sex myoblasts, vulva cell precursors, various nerve cells, glia, or hypodermis. The findings indicate that this in silico workflow provides a promising avenue to identify and prioritise panels/groups of drug target candidates in parasitic nematodes for experimental validation in vitro and/or in vivo.


Subject(s)
Caenorhabditis elegans , Genes, Essential , Haemonchus , Machine Learning , Animals , Haemonchus/genetics , Caenorhabditis elegans/genetics , Helminth Proteins/genetics , Helminth Proteins/metabolism , Computational Biology/methods , Drosophila melanogaster/genetics
13.
Diabetes Care ; 47(9): 1664-1672, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-39012781

ABSTRACT

OBJECTIVE: To evaluate associations of wildfire fine particulate matter ≤2.5 µm in diameter (PM2.5) with diabetes across multiple countries and territories. RESEARCH DESIGN AND METHODS: We collected data on 3,612,135 diabetes hospitalizations from 1,008 locations in Australia, Brazil, Canada, Chile, New Zealand, Thailand, and Taiwan during 2000-2019. Daily wildfire-specific PM2.5 levels were estimated through chemical transport models and machine-learning calibration. Quasi-Poisson regression with distributed lag nonlinear models and random-effects meta-analysis were applied to estimate associations between wildfire-specific PM2.5 and diabetes hospitalization. Subgroup analyses were conducted by age, sex, location income level, and country or territory. Diabetes hospitalizations attributable to wildfire-specific PM2.5 and nonwildfire PM2.5 were compared. RESULTS: Each 10 µg/m3 increase in wildfire-specific PM2.5 levels over the current day and previous 3 days was associated with relative risks (95% CI) of 1.017 (1.011-1.022), 1.023 (1.011-1.035), 1.023 (1.015-1.032), 0.962 (0.823-1.032), 1.033 (1.001-1.066), and 1.013 (1.004-1.022) for all-cause, type 1, type 2, malnutrition-related, other specified, and unspecified diabetes hospitalization, respectively. Stronger associations were observed for all-cause, type 1, and type 2 diabetes in Thailand, Australia, and Brazil; unspecified diabetes in New Zealand; and type 2 diabetes in high-income locations. An estimated 0.67% (0.16-1.18%) and 1.02% (0.20-1.81%) of all-cause and type 2 diabetes hospitalizations, respectively, were attributable to wildfire-specific PM2.5. Compared with nonwildfire PM2.5, wildfire-specific PM2.5 posed greater risks of all-cause, type 1, and type 2 diabetes and was responsible for 38.7% of PM2.5-related diabetes hospitalizations.
CONCLUSIONS: We show the relatively underappreciated links between diabetes and wildfire air pollution, which can lead to a nonnegligible proportion of PM2.5-related diabetes hospitalizations. Precision prevention and mitigation should be developed for those in advantaged communities and in Thailand, Australia, and Brazil.


Subject(s)
Diabetes Mellitus , Hospitalization , Particulate Matter , Wildfires , Humans , Hospitalization/statistics & numerical data , Particulate Matter/analysis , Particulate Matter/adverse effects , Male , Australia/epidemiology , Middle Aged , Female , Diabetes Mellitus/epidemiology , Aged , Thailand/epidemiology , New Zealand/epidemiology , Brazil/epidemiology , Canada/epidemiology , Taiwan/epidemiology , Adult , Environmental Exposure/adverse effects , Environmental Exposure/statistics & numerical data
14.
Med Image Anal ; 97: 103252, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38963973

ABSTRACT

Histopathology image-based survival prediction aims to provide a precise assessment of cancer prognosis and can inform personalized treatment decision-making in order to improve patient outcomes. However, existing methods cannot automatically model the complex correlations between numerous morphologically diverse patches in each whole slide image (WSI), thereby preventing them from achieving a more profound understanding and inference of the patient status. To address this, here we propose a novel deep learning framework, termed dual-stream multi-dependency graph neural network (DM-GNN), to enable precise cancer patient survival analysis. Specifically, DM-GNN is structured with the feature updating and global analysis branches to better model each WSI as two graphs based on morphological affinity and global co-activating dependencies. As these two dependencies depict each WSI from distinct but complementary perspectives, the two designed branches of DM-GNN can jointly achieve the multi-view modeling of complex correlations between the patches. Moreover, DM-GNN is also capable of boosting the utilization of dependency information during graph construction by introducing the affinity-guided attention recalibration module as the readout function. This novel module offers increased robustness against feature perturbation, thereby ensuring more reliable and stable predictions. Extensive benchmarking experiments on five TCGA datasets demonstrate that DM-GNN outperforms other state-of-the-art methods and offers interpretable prediction insights based on the morphological depiction of high-attention patches. Overall, DM-GNN represents a powerful and auxiliary tool for personalized cancer prognosis from histopathology images and has great potential to assist clinicians in making personalized treatment decisions and improving patient outcomes.


Subject(s)
Neural Networks, Computer , Humans , Survival Analysis , Deep Learning , Neoplasms/diagnostic imaging , Neoplasms/mortality , Image Interpretation, Computer-Assisted/methods , Prognosis
15.
Article in English | MEDLINE | ID: mdl-38913512

ABSTRACT

RNA N6-methyladenosine is a prevalent and abundant type of RNA modification that exerts significant influence on diverse biological processes. To date, numerous computational approaches have been developed for predicting methylation, with most of them ignoring the correlations of different encoding strategies and failing to explore the adaptability of various attention mechanisms for methylation identification. To solve the above issues, we proposed an innovative framework for predicting RNA m6A modification sites, termed BLAM6A-Merge. Specifically, it utilized a multimodal feature fusion strategy to combine the classification results of four features and the Blastn tool. In addition, different attention mechanisms were employed to extract higher-level features from specific features after the screening process. Extensive experiments on 12 benchmarking datasets demonstrated that BLAM6A-Merge achieved superior performance (average AUC: 0.849 for the full transcript mode and 0.784 for the mature mRNA mode). Notably, the Blastn tool was employed for the first time in the identification of methylation sites. The data and code can be accessed at https://github.com/DoraemonXia/BLAM6A-Merge.

16.
Cell ; 187(13): 3357-3372.e19, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38866018

ABSTRACT

Microbial hydrogen (H2) cycling underpins the diversity and functionality of diverse anoxic ecosystems. Among the three evolutionarily distinct hydrogenase superfamilies responsible, [FeFe] hydrogenases were thought to be restricted to bacteria and eukaryotes. Here, we show that anaerobic archaea encode diverse, active, and ancient lineages of [FeFe] hydrogenases through combining analysis of existing and new genomes with extensive biochemical experiments. [FeFe] hydrogenases are encoded by genomes of nine archaeal phyla and expressed by H2-producing Asgard archaeon cultures. We report an ultraminimal hydrogenase in DPANN archaea that binds the catalytic H-cluster and produces H2. Moreover, we identify and characterize remarkable hybrid complexes formed through the fusion of [FeFe] and [NiFe] hydrogenases in ten other archaeal orders. Phylogenetic analysis and structural modeling suggest a deep evolutionary history of hybrid hydrogenases. These findings reveal new metabolic adaptations of archaea, streamlined H2 catalysts for biotechnological development, and a surprisingly intertwined evolutionary history between the two major H2-metabolizing enzymes.


Subject(s)
Archaea , Hydrogen , Hydrogenase , Phylogeny , Archaea/genetics , Archaea/enzymology , Archaeal Proteins/metabolism , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Genome, Archaeal , Hydrogen/metabolism , Hydrogenase/metabolism , Hydrogenase/genetics , Hydrogenase/chemistry , Iron-Sulfur Proteins/metabolism , Iron-Sulfur Proteins/genetics , Iron-Sulfur Proteins/chemistry , Models, Molecular , Protein Structure, Tertiary
17.
Int J Epidemiol ; 53(3), 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38725299

ABSTRACT

BACKGROUND: Model-estimated air pollution exposure products have been widely used in epidemiological studies to assess the health risks of particulate matter with diameters of ≤2.5 µm (PM2.5). However, few studies have assessed the disparities in health effects between model-estimated and station-observed PM2.5 exposures. METHODS: We collected daily all-cause, respiratory and cardiovascular mortality data in 347 cities across 15 countries and regions worldwide based on the Multi-City Multi-Country collaborative research network. The station-observed PM2.5 data were obtained from official monitoring stations. The model-estimated global PM2.5 product was developed using a machine-learning approach. The associations between daily exposure to PM2.5 and mortality were evaluated using a two-stage analytical approach. RESULTS: We included 15.8 million all-cause, 1.5 million respiratory and 4.5 million cardiovascular deaths from 2000 to 2018. Short-term exposure to PM2.5 was associated with a relative risk increase (RRI) of mortality from both station-observed and model-estimated exposures. Every 10-µg/m3 increase in the 2-day moving average PM2.5 was associated with overall RRIs of 0.67% (95% CI: 0.49 to 0.85), 0.68% (95% CI: -0.03 to 1.39) and 0.45% (95% CI: 0.08 to 0.82) for all-cause, respiratory, and cardiovascular mortality based on station-observed PM2.5 and RRIs of 0.87% (95% CI: 0.68 to 1.06), 0.81% (95% CI: 0.08 to 1.55) and 0.71% (95% CI: 0.32 to 1.09) based on model-estimated exposure, respectively. CONCLUSIONS: Mortality risks associated with daily PM2.5 exposure were consistent for both station-observed and model-estimated exposures, suggesting the reliability and potential applicability of the global PM2.5 product in epidemiological studies.
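The relative risk increase (RRI) figures in this abstract relate to a relative risk (RR) through RRI = (RR - 1) × 100%. The sketch below shows that conversion, together with the common way of scaling a per-unit coefficient to a 10-µg/m3 increment. The log-linear form RR = exp(β × increment) and the function names are assumptions for illustration; the abstract does not state the exact model specification.

```python
import math


def rr_per_increment(beta: float, increment: float = 10.0) -> float:
    """Relative risk for a given exposure increment.

    Assumes a log-linear exposure-response model in which beta is the
    change in log relative risk per 1 µg/m3 of PM2.5.
    """
    return math.exp(beta * increment)


def relative_risk_increase(rr: float) -> float:
    """Convert a relative risk into a percentage relative risk increase."""
    return (rr - 1.0) * 100.0
```

Under this convention, an RRI of 0.67% corresponds to a relative risk of 1.0067 per 10-µg/m3 increase.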


Subject(s)
Air Pollutants , Air Pollution , Cardiovascular Diseases , Cities , Environmental Exposure , Particulate Matter , Humans , Particulate Matter/adverse effects , Particulate Matter/analysis , Cardiovascular Diseases/mortality , Cities/epidemiology , Environmental Exposure/adverse effects , Air Pollution/adverse effects , Air Pollution/analysis , Air Pollutants/adverse effects , Air Pollutants/analysis , Respiratory Tract Diseases/mortality , Male , Mortality/trends , Female , Middle Aged , Aged , Environmental Monitoring/methods , Adult , Machine Learning
18.
BMC Med ; 22(1): 188, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38715068

ABSTRACT

BACKGROUND: Floods are the most frequent weather-related disaster, causing significant health impacts worldwide. Limited studies have examined the long-term consequences of flooding exposure. METHODS: Flood data were retrieved from the Dartmouth Flood Observatory and linked with health data from 499,487 UK Biobank participants. To calculate the annual cumulative flooding exposure, we multiplied the duration and severity of each flood event and then summed these values for each year. We conducted a nested case-control analysis to evaluate the long-term effect of flooding exposure on all-cause and cause-specific mortality. Each case was matched with eight controls. Flooding exposure was modelled using a distributed lag non-linear model to capture its nonlinear and lagged effects. RESULTS: The risk of all-cause mortality increased by 6.7% (odds ratio (OR): 1.067, 95% confidence interval (CI): 1.063-1.071) for every unit increase in flood index after confounders had been controlled for. The mortality risk from neurological and mental diseases was negligible in the current year, but strongest in the lag years 3 and 4. By contrast, the risk of mortality from suicide was the strongest in the current year (OR: 1.018, 95% CI: 1.008-1.028), and attenuated to lag year 5. Participants with higher levels of education and household income had a higher estimated risk of death from most causes whereas the risk of suicide-related mortality was higher among participants who were obese, had lower household income, engaged in less physical activity, were non-moderate alcohol consumers, and those living in more deprived areas. CONCLUSIONS: Long-term exposure to floods is associated with an increased risk of mortality. The health consequences of flooding exposure would vary across different periods after the event, with different profiles of vulnerable populations identified for different causes of death. 
These findings contribute to a better understanding of the long-term impacts of flooding exposure.
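
The annual cumulative flooding exposure described in the methods (duration multiplied by severity for each event, summed within each year) can be sketched as follows. The tuple layout and the example values are illustrative assumptions, not data from the Dartmouth Flood Observatory or the UK Biobank.

```python
from collections import defaultdict


def annual_flood_exposure(events):
    """Sum duration * severity over all flood events in each year.

    `events` is an iterable of (year, duration_days, severity) tuples;
    this layout is a hypothetical one chosen for illustration.
    """
    totals = defaultdict(float)
    for year, duration_days, severity in events:
        totals[year] += duration_days * severity
    return dict(totals)


def odds_ratio_to_percent(odds_ratio):
    """Express an odds ratio as a percentage change in odds."""
    return (odds_ratio - 1.0) * 100.0


# Made-up events: two floods in 2007 and one in 2012.
example_events = [(2007, 5, 1.5), (2007, 2, 2.0), (2012, 10, 1.0)]
exposure_by_year = annual_flood_exposure(example_events)
```

The second helper shows how the reported odds ratio of 1.067 maps to the quoted 6.7% increase per unit of flood index.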


Subject(s)
Floods , Humans , Floods/mortality , Case-Control Studies , United Kingdom/epidemiology , Male , Female , Aged , Middle Aged , Adult , Cause of Death , Risk Factors
19.
Cell Genom ; 4(6): 100565, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38781966

ABSTRACT

Spatially resolved transcriptomics (SRT) technologies have revolutionized the study of tissue organization. We introduce a graph convolutional network with an attention and positive emphasis mechanism, termed BINARY, relying exclusively on binarized SRT data to accurately delineate spatial domains. BINARY outperforms existing methods across various SRT data types while using significantly less input information. Our study suggests that precise gene expression quantification may not always be essential, inspiring further exploration of the broader applications of spatially resolved binarized gene expression data.
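
The abstract does not say how the SRT data are binarized. As a minimal sketch, assuming a simple presence/absence rule (a gene counts as "on" in a spot when its count exceeds a threshold of 0), binarization of a spots-by-genes count matrix might look like this. The function name and threshold are hypothetical and may differ from BINARY's actual preprocessing.

```python
def binarize_counts(counts, threshold=0):
    """Map a spots-by-genes count matrix to 0/1 values.

    A value becomes 1 when it is strictly greater than `threshold`
    (an assumed rule), otherwise 0.
    """
    return [[1 if value > threshold else 0 for value in row]
            for row in counts]


# Made-up counts for two spots and three genes.
example_counts = [[0, 3, 12], [7, 0, 0]]
example_binary = binarize_counts(example_counts)
```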


Subject(s)
Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Transcriptome/genetics , Algorithms
20.
Article in English | MEDLINE | ID: mdl-38607721

ABSTRACT

N4-acetylcytidine (ac4C) is a post-transcriptional mRNA modification that is critical to the stability and regulation of mRNA translation. In the past few years, numerous approaches employing convolutional neural networks (CNNs) and Transformers have been proposed for identifying ac4C sites, with each type of approach possessing distinct characteristics. CNN-based methods excel at extracting local features and positional information, whereas Transformer-based ones stand out in establishing long-range dependencies and generating global representations. Given the importance of both local and global features in identifying mRNA ac4C sites, we propose a novel method, termed TransC-ac4C, which combines a CNN and a Transformer to enhance feature extraction and improve identification accuracy. Five feature encoding strategies (One-hot, NCP, ND, EIIP, and K-mer) are employed to generate the mRNA sequence representations, so that the sequence attributes and the physical and chemical properties of the sequences can be embedded. To strengthen the relevance of features, we construct a novel feature fusion method. First, the CNN processes the five individual features, which are then concatenated and fed to the Transformer layer. Then, our approach employs the CNN to extract local features and subsequently the Transformer to establish global long-range dependencies among the extracted features. We use 5-fold cross-validation to evaluate the model, and the evaluation indicators are significantly improved. The prediction accuracy of the two datasets is as high as 81.42.
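
The five encodings named above are not defined in the abstract. The sketch below uses conventions commonly seen in the RNA-modification literature and should be read as an assumption, not as TransC-ac4C's implementation: NCP as (ring structure, functional group, hydrogen bonding) triples, ND as the running frequency of each nucleotide up to its position, and EIIP values for U borrowed from the usual DNA value for T. All function names are hypothetical.

```python
# Assumed lookup tables; conventions vary between papers.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "U": (0, 0, 0, 1)}
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "U": (0, 0, 1)}
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "U": 0.1335}


def one_hot(seq):
    """One 4-dimensional indicator vector per nucleotide."""
    return [ONE_HOT[base] for base in seq]


def ncp(seq):
    """Nucleotide chemical property triple per nucleotide."""
    return [NCP[base] for base in seq]


def eiip(seq):
    """Electron-ion interaction pseudopotential value per nucleotide."""
    return [EIIP[base] for base in seq]


def nucleotide_density(seq):
    """For position i (1-based), the fraction of seq[:i] equal to seq[i-1]."""
    counts = {}
    densities = []
    for position, base in enumerate(seq, start=1):
        counts[base] = counts.get(base, 0) + 1
        densities.append(counts[base] / position)
    return densities


def kmers(seq, k=3):
    """Overlapping k-mers with stride 1."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


example_seq = "ACAAGU"  # made-up sequence for illustration
example_density = nucleotide_density(example_seq)
```

Each encoder returns one entry per position (or per k-mer), so the per-nucleotide encodings can be concatenated position by position before being passed to a model.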
