Results 1 - 20 of 798
1.
Cell ; 187(12): 3024-3038.e14, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38781969

ABSTRACT

Plants frequently encounter wounding and have evolved an extraordinary regenerative capacity to heal the wounds. However, the wound signal that triggers regenerative responses has not been identified. Here, through characterization of a tomato mutant defective in both wound-induced defense and regeneration, we demonstrate that in tomato, a plant elicitor peptide (Pep), REGENERATION FACTOR1 (REF1), acts as a systemin-independent local wound signal that primarily regulates local defense responses and regenerative responses in response to wounding. We further identified PEPR1/2 ORTHOLOG RECEPTOR-LIKE KINASE1 (PORK1) as the receptor perceiving REF1 signal for plant regeneration. REF1-PORK1-mediated signaling promotes regeneration via activating WOUND-INDUCED DEDIFFERENTIATION 1 (WIND1), a master regulator of wound-induced cellular reprogramming in plants. Thus, REF1-PORK1 signaling represents a conserved phytocytokine pathway to initiate, amplify, and stabilize a signaling cascade that orchestrates wound-triggered organ regeneration. Application of REF1 provides a simple method to boost the regeneration and transformation efficiency of recalcitrant crops.


Subject(s)
Plant Proteins , Regeneration , Signal Transduction , Solanum lycopersicum , Plant Proteins/metabolism , Plant Proteins/genetics , Solanum lycopersicum/metabolism , Gene Expression Regulation, Plant , Peptides/metabolism
2.
Cell ; 186(17): 3558-3576.e17, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37562403

ABSTRACT

The most extreme environments are the most vulnerable to transformation under a rapidly changing climate. These ecosystems harbor some of the most specialized species, which will likely suffer the highest extinction rates. We document the steepest temperature increase (2010-2021) on record at altitudes of above 4,000 m, triggering a decline of the relictual and highly adapted moss Takakia lepidozioides. Its de-novo-sequenced genome with 27,467 protein-coding genes includes distinct adaptations to abiotic stresses and comprises the largest number of fast-evolving genes under positive selection. The uplift of the study site in the last 65 million years has resulted in life-threatening UV-B radiation and drastically reduced temperatures, and we detected several of the molecular adaptations of Takakia to these environmental changes. Surprisingly, specific morphological features likely occurred earlier than 165 mya in much warmer environments. Following nearly 400 million years of evolution and resilience, this species is now facing extinction.


Subject(s)
Bryophyta , Climate Change , Ecosystem , Acclimatization , Adaptation, Physiological , Tibet , Bryophyta/physiology
3.
EMBO J ; 42(21): e112963, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37743772

ABSTRACT

The large intestine harbors microorganisms playing unique roles in host physiology. The beneficial or detrimental outcome of host-microbiome coexistence depends largely on the balance between regulators and responder intestinal CD4+ T cells. We found that ulcerative colitis-like changes in the large intestine after infection with the protist Blastocystis ST7 in a mouse model are associated with reduction of anti-inflammatory Treg cells and simultaneous expansion of pro-inflammatory Th17 responders. These alterations in CD4+ T cells depended on the tryptophan metabolite indole-3-acetaldehyde (I3AA) produced by this single-cell eukaryote. I3AA reduced the Treg subset in vivo and iTreg development in vitro by modifying their sensing of TGFβ, concomitantly affecting recognition of self-flora antigens by conventional CD4+ T cells. Parasite-derived I3AA also induces over-exuberant TCR signaling, manifested by increased CD69 expression and downregulation of co-inhibitor PD-1. We have thus identified a new mechanism dictating CD4+ fate decisions. The findings thus shine a new light on the ability of the protist microbiome and tryptophan metabolites, derived from them or other sources, to modulate the adaptive immune compartment, particularly in the context of gut inflammatory disorders.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Animals , Mice , Eukaryota/metabolism , Tryptophan/metabolism , T-Lymphocytes, Regulatory
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38754409

ABSTRACT

Drug repurposing offers a viable strategy for discovering new drugs and therapeutic targets through the analysis of drug-gene interactions. However, traditional experimental methods are plagued by their costliness and inefficiency. Despite graph convolutional network (GCN)-based models' state-of-the-art performance in prediction, their reliance on supervised learning makes them vulnerable to data sparsity, a common challenge in drug discovery, further complicating model development. In this study, we propose SGCLDGA, a novel computational model leveraging graph neural networks and contrastive learning to predict unknown drug-gene associations. SGCLDGA employs GCNs to extract vector representations of drugs and genes from the original bipartite graph. Subsequently, singular value decomposition (SVD) is employed to enhance the graph and generate multiple views. The model performs contrastive learning across these views, optimizing vector representations through a contrastive loss function to better distinguish positive and negative samples. The final step involves utilizing inner product calculations to determine association scores between drugs and genes. Experimental results on the DGIdb4.0 dataset demonstrate SGCLDGA's superior performance compared with six state-of-the-art methods. Ablation studies and case analyses validate the significance of contrastive learning and SVD, highlighting SGCLDGA's potential in discovering new drug-gene associations. The code and dataset for SGCLDGA are freely available at https://github.com/one-melon/SGCLDGA.


Subject(s)
Neural Networks, Computer , Humans , Drug Repositioning/methods , Computational Biology/methods , Algorithms , Software , Drug Discovery/methods , Machine Learning
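The last two steps described in the abstract above (a contrastive loss across views, then inner-product association scores) can be illustrated with a minimal sketch. This is not the SGCLDGA code: the abstract does not name its loss, so the widely used InfoNCE form is swapped in as an assumption, and the function names, the temperature value and the toy vectors are all hypothetical.

```python
import math


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss between two views (assumed loss, not from the paper).

    Row i of view_a and row i of view_b form the positive pair;
    every other row of view_b acts as a negative for that anchor.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [inner_product(anchor, other) / temperature for other in view_b]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


def association_scores(drug_vecs, gene_vecs):
    """Score every drug-gene pair by the inner product of their embeddings."""
    return [[inner_product(d, g) for g in gene_vecs] for d in drug_vecs]
```

With two views whose rows are aligned, the loss is lower than when the rows of one view are shuffled, which is the behaviour a contrastive objective of this kind is meant to reward.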
5.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38920342

ABSTRACT

Effective molecular representation learning is very important for Artificial Intelligence-driven Drug Design because it affects the accuracy and efficiency of molecular property prediction and other molecular modeling relevant tasks. However, previous molecular representation learning studies often suffer from limitations, such as over-reliance on a single molecular representation, failure to fully capture both local and global information in molecular structure, and ineffective integration of multiscale features from different molecular representations. These limitations restrict the complete and accurate representation of molecular structure and properties, ultimately impacting the accuracy of predicting molecular properties. To this end, we propose a novel multi-view molecular representation learning method called MvMRL, which can incorporate feature information from multiple molecular representations and capture both local and global information from different views well, thus improving molecular property prediction. Specifically, MvMRL consists of four parts: a multiscale CNN-SE Simplified Molecular Input Line Entry System (SMILES) learning component and a multiscale Graph Neural Network encoder to extract local feature information and global feature information from the SMILES view and the molecular graph view, respectively; a Multi-Layer Perceptron network to capture complex non-linear relationship features from the molecular fingerprint view; and a dual cross-attention component to deeply fuse feature information across the views for predicting molecular properties. We evaluate the performance of MvMRL on 11 benchmark datasets, and experimental results show that MvMRL outperforms state-of-the-art methods, indicating its rationality and effectiveness in molecular property prediction. The source code of MvMRL is available at https://github.com/jedison-github/MvMRL.


Subject(s)
Neural Networks, Computer , Algorithms , Machine Learning , Models, Molecular , Drug Design , Software , Molecular Structure , Artificial Intelligence
6.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38647155

ABSTRACT

Accurately delineating the connection between small nucleolar RNA (snoRNA) and disease is crucial for advancing disease detection and treatment. While traditional biological experimental methods are effective, they are labor-intensive, costly and lack scalability. With the ongoing progress in computer technology, an increasing number of deep learning techniques are being employed to predict snoRNA-disease associations. Nevertheless, the majority of these methods are black-box models, lacking interpretability and the capability to elucidate the snoRNA-disease association mechanism. In this study, we introduce IGCNSDA, an innovative and interpretable graph convolutional network (GCN) approach tailored for the efficient inference of snoRNA-disease associations. IGCNSDA leverages the GCN framework to extract node feature representations of snoRNAs and diseases from the bipartite snoRNA-disease graph. SnoRNAs with high similarity are more likely to be linked to analogous diseases, and vice versa. To facilitate this process, we introduce a subgraph generation algorithm that effectively groups similar snoRNAs and their associated diseases into cohesive subgraphs. Subsequently, we aggregate information from neighboring nodes within these subgraphs, iteratively updating the embeddings of snoRNAs and diseases. The experimental results demonstrate that IGCNSDA outperforms the most recent, highly relevant methods. Additionally, our interpretability analysis provides compelling evidence that IGCNSDA adeptly captures the underlying similarity between snoRNAs and diseases, thus affording researchers enhanced insights into the snoRNA-disease association mechanism. Furthermore, we present illustrative case studies that demonstrate the utility of IGCNSDA as a valuable tool for efficiently predicting potential snoRNA-disease associations. The dataset and source code for IGCNSDA are openly accessible at: https://github.com/altriavin/IGCNSDA.


Subject(s)
RNA, Small Nucleolar , RNA, Small Nucleolar/genetics , Humans , Algorithms , Computational Biology/methods , Neural Networks, Computer , Software , Deep Learning
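The neighbor-aggregation step mentioned in the abstract above can be sketched on a toy bipartite graph. This is an illustration under assumptions rather than the IGCNSDA implementation: it uses a plain mean over neighbors, omits the subgraph grouping and any normalization, and the function name `aggregate_once` and the example identifiers are hypothetical.

```python
def aggregate_once(sno_emb, dis_emb, edges):
    """One round of mean-neighbor aggregation on a bipartite graph.

    sno_emb / dis_emb map node ids to embedding vectors (lists of floats);
    edges is a list of (snoRNA_id, disease_id) pairs.
    Both sides are updated from the OLD embeddings of the other side.
    """
    sno_neighbors = {s: [] for s in sno_emb}
    dis_neighbors = {d: [] for d in dis_emb}
    for s, d in edges:
        sno_neighbors[s].append(d)
        dis_neighbors[d].append(s)

    def mean(vectors):
        # Element-wise mean of a non-empty list of equal-length vectors.
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    # Nodes without neighbors keep their current embedding.
    new_sno = {
        s: mean([dis_emb[d] for d in nbrs]) if nbrs else list(sno_emb[s])
        for s, nbrs in sno_neighbors.items()
    }
    new_dis = {
        d: mean([sno_emb[s] for s in nbrs]) if nbrs else list(dis_emb[d])
        for d, nbrs in dis_neighbors.items()
    }
    return new_sno, new_dis
```

Calling the function repeatedly on its own output gives the iterative update the abstract describes, with each round pulling in information from one hop further away.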
7.
Plant Cell ; 35(3): 1038-1057, 2023 03 15.
Article in English | MEDLINE | ID: mdl-36471914

ABSTRACT

Fruit ripening relies on the precise spatiotemporal control of RNA polymerase II (Pol II)-dependent gene transcription, and the evolutionarily conserved Mediator (MED) coactivator complex plays an essential role in this process. In tomato (Solanum lycopersicum), a model climacteric fruit, ripening is tightly coordinated by ethylene and several key transcription factors. However, the mechanism underlying the transmission of context-specific regulatory signals from these ripening-related transcription factors to the Pol II transcription machinery remains unknown. Here, we report the mechanistic function of MED25, a subunit of the plant Mediator transcriptional coactivator complex, in controlling the ethylene-mediated transcriptional program during fruit ripening. Multiple lines of evidence indicate that MED25 physically interacts with the master transcription factors of the ETHYLENE-INSENSITIVE 3 (EIN3)/EIN3-LIKE (EIL) family, thereby playing an essential role in pre-initiation complex formation during ethylene-induced gene transcription. We also show that MED25 forms a transcriptional module with EIL1 to regulate the expression of ripening-related regulatory as well as structural genes through promoter binding. Furthermore, the EIL1-MED25 module orchestrates both positive and negative feedback transcriptional circuits, along with its downstream regulators, to fine-tune ethylene homeostasis during fruit ripening.


Subject(s)
Solanum lycopersicum , Transcription Factors , Transcription Factors/genetics , Transcription Factors/metabolism , Solanum lycopersicum/genetics , Fruit/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Ethylenes/metabolism , Gene Expression Regulation, Plant
8.
Nature ; 582(7813): 501-505, 2020 06.
Article in English | MEDLINE | ID: mdl-32541968

ABSTRACT

Quantum key distribution (QKD) [1-3] is a theoretically secure way of sharing secret keys between remote users. It has been demonstrated in a laboratory over a coiled optical fibre up to 404 kilometres long [4-7]. In the field, point-to-point QKD has been achieved from a satellite to a ground station up to 1,200 kilometres away [8-10]. However, real-world QKD-based cryptography targets physically separated users on the Earth, for which the maximum distance has been about 100 kilometres [11,12]. The use of trusted relays can extend these distances from across a typical metropolitan area [13-16] to intercity [17] and even intercontinental distances [18]. However, relays pose security risks, which can be avoided by using entanglement-based QKD, which has inherent source-independent security [19,20]. Long-distance entanglement distribution can be realized using quantum repeaters [21], but the related technology is still immature for practical implementations [22]. The obvious alternative for extending the range of quantum communication without compromising its security is satellite-based QKD, but so far satellite-based entanglement distribution has not been efficient [23] enough to support QKD. Here we demonstrate entanglement-based QKD between two ground stations separated by 1,120 kilometres at a finite secret-key rate of 0.12 bits per second, without the need for trusted relays. Entangled photon pairs were distributed via two bidirectional downlinks from the Micius satellite to two ground observatories in Delingha and Nanshan in China. The development of a high-efficiency telescope and follow-up optics crucially improved the link efficiency. The generated keys are secure for realistic devices, because our ground receivers were carefully designed to guarantee fair sampling and immunity to all known side channels [24,25]. Our method not only increases the secure distance on the ground tenfold but also increases the practical security of QKD to an unprecedented level.

9.
Proc Natl Acad Sci U S A ; 120(13): e2210796120, 2023 03 28.
Article in English | MEDLINE | ID: mdl-36947513

ABSTRACT

Rewiring of redox metabolism has a profound impact on tumor development, but how the cellular heterogeneity of redox balance affects leukemogenesis remains unknown. To precisely characterize the dynamic change in redox metabolism in vivo, we developed a bright genetically encoded biosensor for H2O2 (named HyPerion) and tracked the redox state of leukemic cells in situ in a transgenic sensor mouse. A H2O2-low (HyPerion-low) subset of acute myeloid leukemia (AML) cells was enriched with leukemia-initiating cells, which were endowed with high colony-forming ability, potent drug resistance, endosteal rather than vascular localization, and short survival. Significantly high expression of malic enzymes, including ME1/3, accounted for nicotinamide adenine dinucleotide phosphate (NADPH) production and the subsequent low abundance of H2O2. Deletion of malic enzymes decreased the population size of leukemia-initiating cells and impaired their leukemogenic capacity and drug resistance. In summary, by establishing an in vivo redox monitoring tool at single-cell resolution, this work reveals a critical role of redox metabolism in leukemogenesis and a potential therapeutic target.


Subject(s)
Hydrogen Peroxide , Leukemia, Myeloid, Acute , Mice , Animals , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/metabolism , Oxidation-Reduction , Mice, Transgenic , Drug Resistance, Neoplasm/genetics
10.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37401369

ABSTRACT

As the volume of protein sequence and structure data grows rapidly, the functions of the overwhelming majority of proteins cannot be experimentally determined. Automated annotation of protein function at a large scale is becoming increasingly important. Existing computational prediction methods are typically based on expanding the relatively small number of experimentally determined functions to large collections of proteins with various clues, including sequence homology, protein-protein interaction, gene co-expression, etc. Although there has been some progress in protein function prediction in recent years, the development of accurate and reliable solutions still has a long way to go. Here we exploit AlphaFold predicted three-dimensional structural information, together with other non-structural clues, to develop a large-scale approach termed PredGO to annotate Gene Ontology (GO) functions for proteins. We use a pre-trained language model, geometric vector perceptrons and attention mechanisms to extract heterogeneous features of proteins and fuse these features for function prediction. The computational results demonstrate that the proposed method outperforms other state-of-the-art approaches for predicting GO functions of proteins in terms of both coverage and accuracy. The improvement of coverage is because the number of structures predicted by AlphaFold is greatly increased, and on the other hand, PredGO can extensively use non-structural information for functional prediction. Moreover, we show that over 205 000 (~100%) entries in UniProt for human are annotated by PredGO, over 186 000 (~90%) of which are based on predicted structure. The webserver and database are available at http://predgo.denglab.org/.


Subject(s)
Computational Biology , Proteins , Humans , Computational Biology/methods , Proteins/chemistry , Amino Acid Sequence , Neural Networks, Computer , Databases, Factual , Databases, Protein
11.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37985451

ABSTRACT

Non-coding RNAs (ncRNAs) play a critical role in the occurrence and development of numerous human diseases. Consequently, studying the associations between ncRNAs and diseases has garnered significant attention from researchers in recent years. Various computational methods have been proposed to explore ncRNA-disease relationships, with Graph Neural Network (GNN) emerging as a state-of-the-art approach for ncRNA-disease association prediction. In this survey, we present a comprehensive review of GNN-based models for ncRNA-disease associations. Firstly, we provide a detailed introduction to ncRNAs and GNNs. Next, we delve into the motivations behind adopting GNNs for predicting ncRNA-disease associations, focusing on data structure, high-order connectivity in graphs and sparse supervision signals. Subsequently, we analyze the challenges associated with using GNNs in predicting ncRNA-disease associations, covering graph construction, feature propagation and aggregation, and model optimization. We then present a detailed summary and performance evaluation of existing GNN-based models in the context of ncRNA-disease associations. Lastly, we explore potential future research directions in this rapidly evolving field. This survey serves as a valuable resource for researchers interested in leveraging GNNs to uncover the complex relationships between ncRNAs and diseases.


Subject(s)
Neural Networks, Computer , RNA, Untranslated , Humans , RNA, Untranslated/genetics , Research Personnel
12.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36781207

ABSTRACT

Post-translational modifications (PTMs) fine-tune various signaling pathways not only by the modification of a single residue, but also by the interplay of different modifications on residue pairs within or between proteins, defined as PTM cross-talk. As a challenging question, less attention has been given to PTM dynamics underlying cross-talk residue pairs and structural information underlying protein-protein interaction (PPI) graph, limiting the progress in this PTM functional research. Here we propose a novel integrated deep neural network PPICT (Predictor for PTM Inter-protein Cross-Talk), which predicts PTM cross-talk by combining protein sequence-structure-dynamics information and structural information for PPI graph. We find that cross-talk events preferentially occur among residues with high co-evolution and high potential in allosteric regulation. To make full use of the complex associations between protein evolutionary and biophysical features, and protein pair features, a heterogeneous feature combination net is introduced in the final prediction of PPICT. The comprehensive test results show that the proposed PPICT method significantly improves the prediction performance with an AUC value of 0.869, outperforming the existing state-of-the-art methods. Additionally, the PPICT method can capture the potential PTM cross-talks involved in the functional regulatory PTMs on modifying enzymes and their catalyzed PTM substrates. Therefore, PPICT represents an effective tool for identifying PTM cross-talk between proteins at the proteome level and highlights the hints for cross-talk between different signal pathways introduced by PTMs.


Subject(s)
Neural Networks, Computer , Protein Processing, Post-Translational , Proteome/metabolism , Signal Transduction , Protein Domains
13.
Bioinformatics ; 40(Suppl 2): ii190-ii197, 2024 09 01.
Article in English | MEDLINE | ID: mdl-39230706

ABSTRACT

MOTIVATION: Effective molecular representation is critical in drug development. The complex nature of molecules demands comprehensive multi-view representations, considering 1D, 2D, and 3D aspects, to capture diverse perspectives. Obtaining representations that encompass these varied structures is crucial for a holistic understanding of molecules in drug-related contexts. RESULTS: In this study, we introduce an innovative multi-view contrastive learning framework for molecular representation, denoted as MolMVC. Initially, we use a Transformer encoder to capture 1D sequence information and a Graph Transformer to encode the intricate 2D and 3D structural details of molecules. Our approach incorporates a novel attention-guided augmentation scheme, leveraging prior knowledge to create positive samples tailored to different molecular data views. To align multi-view molecular positive samples effectively in latent space, we introduce an adaptive multi-view contrastive loss (AMCLoss). In particular, we calculate AMCLoss at various levels within the model to effectively capture the hierarchical nature of the molecular information. Eventually, we pre-train the encoders via minimizing AMCLoss to obtain the molecular representation, which can be used for various down-stream tasks. In our experiments, we evaluate the performance of our MolMVC on multiple tasks, including molecular property prediction (MPP), drug-target binding affinity (DTA) prediction and cancer drug response (CDR) prediction. The results demonstrate that the molecular representation learned by our MolMVC can enhance the predictive accuracy on these tasks and also reduce the computational costs. Furthermore, we showcase MolMVC's efficacy in drug repositioning across a spectrum of drug-related applications. AVAILABILITY AND IMPLEMENTATION: The code and pre-trained model are publicly available at https://github.com/Hhhzj-7/MolMVC.


Subject(s)
Machine Learning , Algorithms , Computational Biology/methods , Pharmaceutical Preparations/chemistry
14.
PLoS Biol ; 20(8): e3001516, 2022 08.
Article in English | MEDLINE | ID: mdl-36026438

ABSTRACT

Triglycerides are carried in the bloodstream as part of very low-density lipoproteins (VLDLs) and chylomicrons, which represent the triglyceride-rich lipoproteins. Triglyceride-rich lipoproteins and their remnants contribute to atherosclerosis, possibly by carrying remnant cholesterol and/or by exerting a proinflammatory effect on macrophages. Nevertheless, little is known about how macrophages process triglyceride-rich lipoproteins. Here, using VLDL-sized triglyceride-rich emulsion particles, we aimed to study the mechanism by which VLDL triglycerides are taken up, processed, and stored in macrophages. Our results show that macrophage uptake of VLDL-sized emulsion particles is dependent on lipoprotein lipase (LPL) and requires the lipoprotein-binding C-terminal domain but not the catalytic N-terminal domain of LPL. Subsequent internalization of VLDL-sized emulsion particles by macrophages is carried out by caveolae-mediated endocytosis, followed by triglyceride hydrolysis catalyzed by lysosomal acid lipase. It is shown that STARD3 is required for the transfer of lysosomal fatty acids to the ER for subsequent storage as triglycerides, while NPC1 likely is involved in promoting the extracellular efflux of fatty acids from lysosomes. Our data provide novel insights into how macrophages process VLDL triglycerides and suggest that macrophages have the remarkable capacity to excrete part of the internalized triglycerides as fatty acids.


Subject(s)
Caveolae , Fatty Acids , Emulsions , Endocytosis , Lipoproteins , Macrophages , Triglycerides
15.
Nature ; 572(7767): 106-111, 2019 08.
Article in English | MEDLINE | ID: mdl-31367028

ABSTRACT

There are two general approaches to developing artificial general intelligence (AGI) [1]: computer-science-oriented and neuroscience-oriented. Because of the fundamental differences in their formulations and coding schemes, these two approaches rely on distinct and incompatible platforms [2-8], retarding the development of AGI. A general platform that could support the prevailing computer-science-based artificial neural networks as well as neuroscience-inspired models and algorithms is highly desirable. Here we present the Tianjic chip, which integrates the two approaches to provide a hybrid, synergistic platform. The Tianjic chip adopts a many-core architecture, reconfigurable building blocks and a streamlined dataflow with hybrid coding schemes, and can not only accommodate computer-science-based machine-learning algorithms, but also easily implement brain-inspired circuits and several coding schemes. Using just one chip, we demonstrate the simultaneous processing of versatile algorithms and models in an unmanned bicycle system, realizing real-time object detection, tracking, voice control, obstacle avoidance and balance control. Our study is expected to stimulate AGI development by paving the way to more generalized hardware platforms.

16.
Proc Natl Acad Sci U S A ; 119(12): e2114739119, 2022 03 22.
Article in English | MEDLINE | ID: mdl-35302892

ABSTRACT

In response to inflammatory activation by pathogens, macrophages accumulate triglycerides in intracellular lipid droplets. The mechanisms underlying triglyceride accumulation and its exact role in the inflammatory response of macrophages are not fully understood. Here, we aim to further elucidate the mechanism and function of triglyceride accumulation in the inflammatory response of activated macrophages. Lipopolysaccharide (LPS)-mediated activation markedly increased triglyceride accumulation in macrophages. This increase could be attributed to up-regulation of the hypoxia-inducible lipid droplet-associated (HILPDA) protein, which down-regulated adipose triglyceride lipase (ATGL) protein levels, in turn leading to decreased ATGL-mediated triglyceride hydrolysis. The reduction in ATGL-mediated lipolysis attenuated the inflammatory response in macrophages after ex vivo and in vitro activation, and was accompanied by decreased production of prostaglandin-E2 (PGE2) and interleukin-6 (IL-6). Overall, we provide evidence that LPS-mediated activation of macrophages suppresses lipolysis via induction of HILPDA, thereby reducing the availability of proinflammatory lipid precursors and suppressing the production of PGE2 and IL-6.


Subject(s)
Lipid Droplets , Lipid Metabolism , Humans , Inflammation/metabolism , Lipid Droplets/metabolism , Lipids , Macrophages/metabolism , Neoplasm Proteins/metabolism , Triglycerides/metabolism
17.
BMC Genomics ; 25(1): 839, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39243028

ABSTRACT

BACKGROUND: The postharvest rot of kiwifruit is one of the most devastating diseases affecting kiwifruit quality worldwide. However, the genomic basis and pathogenicity mechanisms of kiwifruit rot pathogens are lacking. Here we report the first whole genome sequence of Pestalotiopsis microspora, one of the main pathogens causing postharvest kiwifruit rot in China. The genome of strain KFRD-2 was sequenced, de novo assembled, and analyzed. RESULTS: The genome of KFRD-2 was estimated to be approximately 50.31 Mb in size, with an overall GC content of 50.25%. Among 14,711 predicted genes, 14,423 (98.04%) exhibited significant matches to genes in the NCBI nr database. A phylogenetic analysis of 26 known pathogenic fungi, including P. microspora KFRD-2, based on conserved orthologous genes, revealed that KFRD-2's closest evolutionary relationships were to Neopestalotiopsis spp. Among KFRD-2's coding genes, 870 putative CAZy genes spanned six classes of CAZys, which play roles in degrading plant cell walls. Out of the 25 other plant pathogenic fungi, P. microspora possessed a greater number of CAZy genes than 22 and was especially enriched in GH and AA genes. A total of 845 transcription factors and 86 secondary metabolism gene clusters were predicted, representing various types. Furthermore, 28 effectors and 109 virulence-enhanced factors were identified using the PHI (pathogen host-interacting) database. CONCLUSION: This complete genome sequence analysis of the kiwifruit postharvest rot pathogen P. microspora enriches our understanding of its disease pathogenesis and virulence. This study establishes a theoretical foundation for future investigations into the pathogenic mechanisms of P. microspora and the development of enhanced strategies for the efficient management of kiwifruit postharvest rots.


Subject(s)
Actinidia , Phylogeny , Plant Diseases , Whole Genome Sequencing , Actinidia/microbiology , Plant Diseases/microbiology , Genome, Fungal , Fruit/microbiology
18.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35753699

ABSTRACT

MOTIVATION: The interplay between protein and nucleic acid participates in diverse biological activities. Accurately identifying the interaction between protein and nucleic acid can strengthen the understanding of protein function. However, conventional methods are too time-consuming, and computational methods make type-agnostic predictions. We proposed an ensemble predictor termed TSNAPred and first used it to identify residues that bind to A-DNA, B-DNA, ssDNA, mRNA, tRNA and rRNA. TSNAPred combines LightGBM and a capsule network, both trained on features derived from protein sequence. TSNAPred utilizes the sliding window technique to extract long-distance dependencies between residues and a weighted ensemble strategy to enhance the prediction performance. The results show that TSNAPred can effectively identify type-specific nucleic acid binding residues in our test set. Moreover, it can also discriminate DNA-binding and RNA-binding residues, improving the AUC value by 5% to 10% compared with other state-of-the-art methods. The dataset and code of TSNAPred are available at: https://github.com/niewenjuan-csu/TSNAPred.


Subject(s)
Computational Biology , Nucleic Acids , Algorithms , Amino Acid Sequence , Computational Biology/methods , Nucleic Acids/metabolism , Protein Binding , Proteins/metabolism
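The two generic techniques named in the abstract above, a sliding window over the protein sequence and a weighted ensemble of two predictors, can be sketched as follows. This is not TSNAPred's code: the window size, the padding symbol `X`, the weight and both function names are assumptions chosen for illustration, and the probability lists merely stand in for the outputs of two trained models.

```python
def sliding_windows(sequence, size=5, pad="X"):
    """Return one fixed-size window per residue, centered on that residue.

    The sequence is padded at both ends so that terminal residues
    also receive a full window (size must be odd).
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd so a residue sits at the center")
    half = size // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + size] for i in range(len(sequence))]


def weighted_ensemble(probs_a, probs_b, weight_a=0.5):
    """Blend per-residue probabilities from two models with a fixed weight."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]
```

For the sequence `MKV` with a window of 3, the windows are `XMK`, `MKV` and `KVX`; each would be encoded as features for one residue before the two models' outputs are blended.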
19.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35940592

ABSTRACT

MOTIVATION: Accurate and efficient prediction of molecular properties is one of the fundamental problems in drug research and development. Recent advancements in representation learning have been shown to greatly improve the performance of molecular property prediction. However, due to limited labeled data, supervised learning-based molecular representation algorithms can only search limited chemical space and suffer from poor generalizability. RESULTS: In this work, we proposed a self-supervised learning method, ATMOL, for molecular representation learning and property prediction. We developed a novel molecular graph augmentation strategy, referred to as attention-wise graph masking, to generate challenging positive samples for contrastive learning. We adopted the graph attention network as the molecular graph encoder, and leveraged the learned attention weights as masking guidance to generate molecular augmentation graphs. By minimizing the contrastive loss between the original graph and the augmented graph, our model can capture important molecular structure and higher order semantic information. Extensive experiments showed that our attention-wise graph mask contrastive learning exhibited state-of-the-art performance in a couple of downstream molecular property prediction tasks. We also verified that pretraining our model on a larger scale of unlabeled data improved the generalization of the learned molecular representation. Moreover, visualization of the attention heatmaps showed meaningful patterns indicative of atoms and atomic groups important to specific molecular properties.


Subject(s)
Algorithms , Semantics
20.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34571537

ABSTRACT

MOTIVATION: Drug combination therapy has become an increasingly promising method in the treatment of cancer. However, the number of possible drug combinations is so huge that it is hard to screen synergistic drug combinations through wet-lab experiments. Therefore, computational screening has become an important way to prioritize drug combinations. Graph neural networks have recently shown remarkable performance in the prediction of compound-protein interactions, but they have not been applied to the screening of drug combinations. RESULTS: In this paper, we proposed a deep learning model based on a graph neural network and an attention mechanism to identify drug combinations that can effectively inhibit the viability of specific cancer cells. The feature embeddings of drug molecule structure and gene expression profiles were taken as input to a multilayer feedforward neural network to identify the synergistic drug combinations. We compared DeepDDS (Deep Learning for Drug-Drug Synergy prediction) with classical machine learning methods and other deep learning-based methods on a benchmark data set, and the leave-one-out experimental results showed that DeepDDS achieved better performance than competitive methods. Also, on an independent test set released by the well-known pharmaceutical enterprise AstraZeneca, DeepDDS was superior to competitive methods by more than 16% in predictive precision. Furthermore, we explored the interpretability of the graph attention network and found that the correlation matrix of atomic features revealed important chemical substructures of drugs. We believe that DeepDDS is an effective tool for prioritizing synergistic drug combinations for further wet-lab experimental validation. AVAILABILITY AND IMPLEMENTATION: Source code and data are available at https://github.com/Sinwang404/DeepDDS/tree/master.


Subject(s)
Neoplasms , Neural Networks, Computer , Drug Combinations , Humans , Machine Learning , Neoplasms/drug therapy , Software