Search | VHL Search Portal

Complementary multi-modality molecular self-supervised learning via non-overlapping masking for property prediction.

Shen, Ao; Yuan, Mingzhi; Ma, Yingfan; Du, Jie; Wang, Manning.

Brief Bioinform ; 25(4)2024 May 23.

Article in English | MEDLINE | ID: mdl-38801702

ABSTRACT

Self-supervised learning plays an important role in molecular representation learning because labeled molecular data are usually limited in many tasks, such as chemical property prediction and virtual screening. However, most existing molecular pre-training methods focus on one modality of molecular data, and the complementary information of two important modalities, SMILES and graph, is not fully explored. In this study, we propose an effective multi-modality self-supervised learning framework for molecular SMILES and graph. Specifically, SMILES data and graph data are first tokenized so that they can be processed by a unified Transformer-based backbone network, which is trained by a masked reconstruction strategy. In addition, we introduce a specialized non-overlapping masking strategy to encourage fine-grained interaction between these two modalities. Experimental results show that our framework achieves state-of-the-art performance in a series of molecular property prediction tasks, and a detailed ablation study demonstrates efficacy of the multi-modality framework and the masking strategy.
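The abstract does not spell out how the non-overlapping masking is built, so the following is only a minimal sketch of one plausible reading: atoms shared by the SMILES and graph views are indexed 0..n-1, and the masked indices for the two modalities are drawn so that no atom is hidden in both views at once. The function name `non_overlapping_masks`, the `mask_ratio` parameter, and the assumption of a one-to-one atom alignment are all hypothetical and not taken from the paper.

```python
import random


def non_overlapping_masks(num_atoms, mask_ratio=0.25, seed=None):
    """Return two disjoint sets of atom indices to mask (hypothetical helper).

    Assumes each atom has one token in the SMILES view and one node in the
    graph view, so a single index space covers both modalities.
    """
    if not 0.0 <= mask_ratio <= 0.5:
        # Above 0.5 the two masks could not both reach the ratio and stay disjoint.
        raise ValueError("mask_ratio must be in [0, 0.5]")
    rng = random.Random(seed)
    order = list(range(num_atoms))
    rng.shuffle(order)
    k = int(num_atoms * mask_ratio)
    smiles_mask = set(order[:k])        # indices hidden in the SMILES view
    graph_mask = set(order[k:2 * k])    # indices hidden in the graph view
    return smiles_mask, graph_mask


smiles_mask, graph_mask = non_overlapping_masks(num_atoms=12, mask_ratio=0.25, seed=0)
```

Under this assumption, every atom masked in one modality stays visible in the other, which is one way the reconstruction objective could be pushed to draw on cross-modal information.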

Subject(s)

Supervised Machine Learning , Algorithms , Computational Biology/methods

PGBind: pocket-guided explicit attention learning for protein-ligand docking.

Shen, Ao; Yuan, Mingzhi; Ma, Yingfan; Du, Jie; Wang, Manning.

Brief Bioinform ; 25(5)2024 Jul 25.

Article in English | MEDLINE | ID: mdl-39293803

ABSTRACT

As more and more protein structures are discovered, blind protein-ligand docking will play an important role in drug discovery because it can predict protein-ligand complex conformation without pocket information on the target proteins. Recently, deep learning-based methods have made significant advancements in blind protein-ligand docking, but their protein features are suboptimal because they do not fully consider the difference between potential pocket regions and non-pocket regions in protein feature extraction. In this work, we propose a pocket-guided strategy for guiding the ligand to dock to potential docking regions on a protein. To this end, we design a plug-and-play module to enhance the protein features, which can be directly incorporated into existing deep learning-based blind docking methods. The proposed module first estimates potential pocket regions on the target protein and then leverages a pocket-guided attention mechanism to enhance the protein features. Experiments are conducted on integrating our method with EquiBind and FABind, and the results show that their blind-docking performances are both significantly improved and new state-of-the-art performance is achieved by integration with FABind.

Subject(s)

Drug Discovery , Ligands , Proteins , Algorithms , Binding Sites , Computational Biology/methods , Deep Learning , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/chemistry , Proteins/metabolism

ProteinMAE: masked autoencoder for protein surface self-supervised learning.

Yuan, Mingzhi; Shen, Ao; Fu, Kexue; Guan, Jiaming; Ma, Yingfan; Qiao, Qin; Wang, Manning.

Bioinformatics ; 39(12)2023 Dec 01.

Article in English | MEDLINE | ID: mdl-38019955

ABSTRACT

SUMMARY: The biological functions of proteins are determined by the chemical and geometric properties of their surfaces. Recently, with the booming progress of deep learning, a series of learning-based surface descriptors have been proposed and achieved inspirational performance in many tasks such as protein design, protein-protein interaction prediction, etc. However, they are still limited by the problem of label scarcity, since the labels are typically obtained through wet experiments. Inspired by the great success of self-supervised learning in natural language processing and computer vision, we introduce ProteinMAE, a self-supervised framework specifically designed for protein surface representation to mitigate label scarcity. Specifically, we propose an efficient network and utilize a large number of accessible unlabeled protein data to pretrain it by self-supervised learning. Then we use the pretrained weights as initialization and fine-tune the network on downstream tasks. To demonstrate the effectiveness of our method, we conduct experiments on three different downstream tasks including binding site identification in protein surface, ligand-binding protein pocket classification, and protein-protein interaction prediction. The extensive experiments show that our method not only successfully improves the network's performance on all downstream tasks, but also achieves competitive performance with state-of-the-art methods. Moreover, our proposed network also exhibits significant advantages in terms of computational cost, which only requires less than a tenth of memory cost of previous methods. AVAILABILITY AND IMPLEMENTATION: https://github.com/phdymz/ProteinMAE.

Subject(s)

Membrane Proteins , Natural Language Processing , Binding Sites , Protein Domains , Supervised Machine Learning

Pressure-induced structural and electronic transitions of thiospinel Fe₃S₄.

Susilo, Resta A; Li, Guowei; Feng, Jiajia; Deng, Wen; Yuan, Mingzhi; Li, Shujia; Dong, Hongliang; Chen, Bin.

J Phys Condens Matter ; 31(9): 095401, 2019 Mar 06.

Article in English | MEDLINE | ID: mdl-30583290

ABSTRACT

We report the investigations on the structural and electronic properties of an inverse spinel Fe₃S₄ at high pressures using synchrotron x-ray diffraction (XRD) and electrical transport measurements. Our XRD measurements at high pressures reveal an irreversible structural phase transformation on compression above ~3 GPa from a cubic spinel (Fd-3m space group) into a monoclinic Cr₃S₄-type structure (I2/m space group). Electrical transport measurements suggest that the high pressure monoclinic phase has a semiconducting behavior. This semiconducting behavior is found to persist up to the highest pressure of measurement of ~23 GPa. These results show that while Fe₃S₄ possesses similar high pressure structural properties with other thiospinels, the electronic properties under pressure show a rather strong similarity to its oxide counterpart, Fe₃O₄, at high pressures.

Application of apparent diffusion coefficient and exponent apparent diffusion coefficient values in magnetic resonance imaging diffusion-weighted imaging to differentiate benign and malignant ovarian epithelial tumors.

Wang, Yu-Xing; Yuan, Ming-Zhi; Wen, Zhao-Xia.

J Cancer Res Ther ; 12(1): 401-5, 2016.

Article in English | MEDLINE | ID: mdl-27072270

ABSTRACT

OBJECTIVE: The aim of this study was to investigate the value of two quantitative indicators, the apparent diffusion coefficient (ADC) and the exponent apparent diffusion coefficient (EADC), of magnetic resonance imaging (MRI) diffusion-weighted imaging (DWI) in the differential diagnosis of ovarian epithelial tumors. MATERIALS AND METHODS: Clinical and MRI data from ovarian epithelial tumors were analyzed after pathology confirmation of 85 lesions from 76 cases (47 lesions from 41 benign cases; 38 lesions from 35 malignant cases). Patients underwent routine MRI examination and DWI before surgery. The average ADC and EADC values of the solid sections of the tumors were measured when the b value was 1000 s/mm². RESULTS: The mean ADC value of the solid sections in the benign group was 1.28 ± 0.23 × 10⁻³ mm²/s; the average EADC value was 27.96 ± 5.78 × 10⁻². In the malignant group, the mean ADC value of the solid sections was 0.86 ± 0.17 × 10⁻³ mm²/s; the average EADC value was 42.37 ± 5.96 × 10⁻². When the b value was 1000 s/mm², there was a statistically significant difference in ADC and EADC values between benign and malignant ovarian tumors (P < 0.05). CONCLUSION: ADC and EADC values of DWI can be used to differentiate benign and malignant ovarian epithelial tumors.
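The abstract does not define EADC. A commonly used definition is EADC = exp(-b × ADC), and the short check below assumes that definition to show how the reported ADC and EADC means relate at b = 1000 s/mm². The function name `eadc` is illustrative only. Because the study averages per-lesion EADC values, the result computed from the mean ADC is expected to agree only approximately with the reported means.

```python
import math


def eadc(adc_mm2_per_s, b_s_per_mm2=1000.0):
    """Exponent ADC under the assumed definition EADC = exp(-b * ADC)."""
    return math.exp(-b_s_per_mm2 * adc_mm2_per_s)


# Mean ADC values reported in the abstract (mm^2/s).
benign = eadc(1.28e-3)      # reported mean EADC: 27.96 x 10^-2
malignant = eadc(0.86e-3)   # reported mean EADC: 42.37 x 10^-2

print(f"benign EADC ~ {benign:.3f}, malignant EADC ~ {malignant:.3f}")
```

Under this assumption the computed values, about 0.278 and 0.423, sit close to the reported means. A lower ADC (more restricted diffusion) corresponds to a higher EADC, consistent with the direction of the benign and malignant group values.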

Subject(s)

Diffusion Magnetic Resonance Imaging/methods , Neoplasms, Glandular and Epithelial/diagnostic imaging , Neoplasms, Glandular and Epithelial/diagnosis , Ovarian Neoplasms/diagnostic imaging , Ovarian Neoplasms/diagnosis , Adolescent , Adult , Aged , Carcinoma, Ovarian Epithelial , Female , Humans , Middle Aged , Neoplasms, Glandular and Epithelial/pathology , Ovarian Neoplasms/pathology , ROC Curve
