Search | VHL Regional Portal

1.

GRACE: Unveiling Gene Regulatory Networks With Causal Mechanistic Graph Neural Networks in Single-Cell RNA-Sequencing Data.

Wang, Jia-Cheng; Chen, Yao-Jia; Zou, Quan.

IEEE Trans Neural Netw Learn Syst ; PP, 2024 Jun 19.

Article in English | MEDLINE | ID: mdl-38896510

ABSTRACT

Reconstructing gene regulatory networks (GRNs) using single-cell RNA sequencing (scRNA-seq) data holds great promise for unraveling cellular fate development and heterogeneity. While numerous machine-learning methods have been proposed to infer GRNs from scRNA-seq gene expression data, many of them operate solely in a statistical or black box manner, limiting their capacity for making causal inferences between genes. In this study, we introduce GRN inference with Accuracy and Causal Explanation (GRACE), a novel graph-based causal autoencoder framework that combines a structural causal model (SCM) with graph neural networks (GNNs) to enable GRN inference and gene causal reasoning from scRNA-seq data. By explicitly modeling causal relationships between genes, GRACE facilitates the learning of regulatory context and gene embeddings. With the learned gene signals, our model successfully decodes the causal structures and facilitates the accurate determination of multiple attributes of gene regulation that are important for determining regulatory levels. Through extensive evaluations on seven benchmarks, we demonstrate that GRACE outperforms 14 state-of-the-art GRN inference methods, with the incorporation of causal mechanisms significantly enhancing the accuracy of GRN and gene causality inference. Furthermore, the application to human peripheral blood mononuclear cell (PBMC) samples reveals cell type-specific regulators in monocyte phagocytosis and immune regulation, validated through network analysis and functional enrichment analysis.

2.

Sequence homology score-based deep fuzzy network for identifying therapeutic peptides.

Guo, Xiaoyi; Zheng, Ziyu; Cheong, Kang Hao; Zou, Quan; Tiwari, Prayag; Ding, Yijie.

Neural Netw ; 178: 106458, 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38901093

ABSTRACT

The detection of therapeutic peptides is a topic of immense interest in the biomedical field. Conventional biochemical experiment-based detection techniques are tedious and time-consuming. Computational biology has become a useful tool for improving the detection efficiency of therapeutic peptides. Most computational methods do not consider the deviation caused by noise. To improve the generalization performance of therapeutic peptide prediction methods, this work presents a sequence homology score-based deep fuzzy echo-state network with maximizing mixture correntropy (SHS-DFESN-MMC) model. Our method is compared with the existing methods on eight types of therapeutic peptide datasets. The model parameters are determined by 10-fold cross-validation on the training sets and verified on independent test sets. Across the eight datasets, the average area under the receiver operating characteristic curve (AUC) values of SHS-DFESN-MMC are the highest on both the training (0.926) and independent sets (0.923).
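The AUC reported in the abstract above can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition; the function name `roc_auc` and the labels and scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = therapeutic peptide) and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
tie_auc = roc_auc([1, 0], [0.5, 0.5])
```

The quadratic pair loop is chosen for readability; a rank-based formulation gives the same value and scales better on large test sets.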

3.

Drug-target interaction predictions with multi-view similarity network fusion strategy and deep interactive attention mechanism.

Song, Wei; Xu, Lewen; Han, Chenguang; Tian, Zhen; Zou, Quan.

Bioinformatics ; 40(6), 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38837345

ABSTRACT

MOTIVATION: Accurately identifying drug-target interactions (DTIs) is one of the crucial steps in the drug discovery and drug repositioning process. Currently, many computational models have been proposed for DTI prediction and have achieved significant improvements. However, these approaches pay little attention to fusing the multi-view similarity networks related to drugs and targets in an appropriate way. Besides, how to fully incorporate the known interaction relationships to accurately represent drugs and targets has not been well investigated. Therefore, there is still a need to improve the accuracy of DTI prediction models. RESULTS: In this study, we propose a novel approach that employs Multi-view similarity network fusion strategy and deep Interactive attention mechanism to predict Drug-Target Interactions (MIDTI). First, MIDTI constructs multi-view similarity networks of drugs and targets with their diverse information and integrates these similarity networks effectively in an unsupervised manner. Then, MIDTI obtains the embeddings of drugs and targets from multi-type networks simultaneously. After that, MIDTI adopts the deep interactive attention mechanism to further learn their discriminative embeddings comprehensively with the known DTI relationships. Finally, we feed the learned representations of drugs and targets to the multilayer perceptron model and predict the underlying interactions. Extensive results indicate that MIDTI significantly outperforms other baseline methods on the DTI prediction task. The results of the ablation experiments also confirm the effectiveness of the attention mechanism in the multi-view similarity network fusion strategy and the deep interactive attention mechanism. AVAILABILITY AND IMPLEMENTATION: https://github.com/XuLew/MIDTI.

Subject(s)

Computational Biology, Computational Biology/methods, Drug Discovery/methods, Algorithms, Drug Repositioning/methods, Pharmaceutical Preparations/metabolism, Pharmaceutical Preparations/chemistry, Humans

4.

A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data.

Sun, Yidi; Kong, Lingling; Huang, Jiayi; Deng, Hongyan; Bian, Xinling; Li, Xingfeng; Cui, Feifei; Dou, Lijun; Cao, Chen; Zou, Quan; Zhang, Zilong.

Brief Funct Genomics ; 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38860675

ABSTRACT

In recent years, the application of single-cell transcriptomics and spatial transcriptomics analysis techniques has become increasingly widespread. Whether dealing with single-cell transcriptomic or spatial transcriptomic data, dimensionality reduction and clustering are indispensable. Both single-cell and spatial transcriptomic data are often high-dimensional, making the analysis and visualization of such data challenging. Through dimensionality reduction, it becomes possible to visualize the data in a lower-dimensional space, allowing for the observation of relationships and differences between cell subpopulations. Clustering enables the grouping of similar cells into the same cluster, aiding in the identification of distinct cell subpopulations and revealing cellular diversity, providing guidance for downstream analyses. In this review, we systematically summarized the most widely recognized algorithms employed for the dimensionality reduction and clustering analysis of single-cell transcriptomic and spatial transcriptomic data. This endeavor provides valuable insights and ideas that can contribute to the development of novel tools in this rapidly evolving field.

5.

RDscan: Extracting RNA-disease relationship from the literature based on pre-training model.

Zhang, Yang; Yang, Yu; Ren, Liping; Ning, Lin; Zou, Quan; Luo, Nanchao; Zhang, Yinghui; Liu, Ruijun.

Methods ; 228: 48-54, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38789016

ABSTRACT

With the rapid advancements in molecular biology and genomics, a multitude of connections between RNA and diseases has been unveiled, making the efficient and accurate extraction of RNA-disease (RD) relationships from extensive biomedical literature crucial for advancing research in this field. This study introduces RDscan, a novel text mining method based on the pre-training and fine-tuning strategy, aimed at automatically extracting RD-related information from a vast corpus of literature using pre-trained biomedical large language models (LLMs). Initially, we constructed a dedicated RD corpus by manual curation of the literature, comprising 2,082 positive and 2,000 negative sentences, alongside an independent test dataset (comprising 500 positive and 500 negative sentences) for training and evaluating RDscan. Subsequently, by fine-tuning the Bioformer and BioBERT pre-trained models, RDscan demonstrated exceptional performance in text classification and named entity recognition (NER) tasks. In 5-fold cross-validation, RDscan significantly outperformed traditional machine learning methods (Support Vector Machine, Logistic Regression and Random Forest). In addition, we have developed an accessible webserver that assists users in extracting RD relationships from text. In summary, RDscan represents the first text mining tool specifically designed for RD relationship extraction, and is poised to emerge as an invaluable tool for researchers dedicated to exploring the intricate interactions between RNA and diseases. The RDscan webserver is freely available at https://cellknowledge.com.cn/RDscan/.

Subject(s)

Data Mining, RNA, Data Mining/methods, RNA/genetics, Humans, Machine Learning, Disease/genetics, Support Vector Machine, Software

6.

DeepAVP-TPPred: identification of antiviral peptides using transformed image-based localized descriptors and binary tree growth algorithm.

Ullah, Matee; Akbar, Shahid; Raza, Ali; Zou, Quan.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38710482

ABSTRACT

MOTIVATION: Despite the extensive manufacturing of antiviral drugs and vaccination, viral infections continue to be a major human ailment. Antiviral peptides (AVPs) have emerged as potential candidates in the pursuit of novel antiviral drugs. These peptides show vigorous antiviral activity against a diverse range of viruses by targeting different phases of the viral life cycle. Therefore, the accurate prediction of AVPs is an essential yet challenging task. Lately, many machine learning-based approaches have been developed for this purpose; however, their limited capabilities in terms of feature engineering, accuracy, and generalization restrict these methods. RESULTS: In the present study, we aim to develop an efficient machine learning-based approach for the identification of AVPs, referred to as DeepAVP-TPPred, to address the aforementioned problems. First, we extracted two new transformed feature sets using our designed image-based feature extraction algorithms and integrated them with an evolutionary information-based feature. Next, these feature sets were optimized using a novel feature selection approach called the binary tree growth algorithm. Finally, the optimal feature space from the training dataset was fed to the deep neural network to build the final classification model. The proposed model DeepAVP-TPPred was tested using stringent 5-fold cross-validation and two independent datasets, where it achieved the highest performance and showed improved accuracy and generalization over existing predictors. AVAILABILITY AND IMPLEMENTATION: https://github.com/MateeullahKhan/DeepAVP-TPPred.

Subject(s)

Algorithms, Antiviral Agents, Machine Learning, Antiviral Agents/pharmacology, Antiviral Agents/chemistry, Peptides/chemistry, Humans, Computational Biology/methods, Computer Neural Networks

7.

ADAM17 variant causes hair loss via ubiquitin ligase TRIM47 mediated degradation.

Wang, Xiaoxiao; Pan, Chaolan; Zheng, Luyao; Wang, Jianbo; Zou, Quan; Sun, Peiyi; Zhou, Kaili; Zhao, Anqi; Cao, Qiaoyu; He, Wei; Wang, Yumeng; Cheng, Ruhong; Yao, Zhirong; Zhang, Si; Zhang, Hui; Li, Ming.

JCI Insight ; 2024 May 21.

Article in English | MEDLINE | ID: mdl-38771644

ABSTRACT

Hypotrichosis is a genetic disorder characterized by a diffuse and progressive loss of scalp and/or body hair. Nonetheless, the causative genes for several affected individuals remain elusive, and the underlying mechanisms have yet to be fully elucidated. Here, we discovered a dominant variant in the ADAM17 gene that caused hypotrichosis with woolly hair. An Adam17 (p.D647N) knock-in mouse model mimicked the hair abnormality seen in patients. The ADAM17 (p.D647N) mutation led to hair follicle stem cell (HFSC) exhaustion and caused abnormal hair follicles, ultimately resulting in alopecia. Mechanistic studies revealed that ADAM17 binds directly to the E3 ubiquitin ligase TRIM47. The ADAM17 (p.D647N) variant enhanced the association between ADAM17 and TRIM47, leading to increased ubiquitination and subsequent degradation of the ADAM17 protein. Furthermore, reduced ADAM17 protein expression affected the Notch signaling pathway, impairing the activation, proliferation, and differentiation of HFSCs during hair follicle regeneration. Overexpression of NICD rescued the reduced proliferation ability caused by the Adam17 variant in primary fibroblast cells.

8.

msBERT-Promoter: a multi-scale ensemble predictor based on BERT pre-trained model for the two-stage prediction of DNA promoters and their strengths.

Li, Yazi; Wei, Xiaoman; Yang, Qinglin; Xiong, An; Li, Xingfeng; Zou, Quan; Cui, Feifei; Zhang, Zilong.

BMC Biol ; 22(1): 126, 2024 May 30.

Article in English | MEDLINE | ID: mdl-38816885

ABSTRACT

BACKGROUND: A promoter is a specific sequence in DNA that has transcriptional regulatory functions, playing a role in initiating gene expression. Identifying promoters and their strengths can provide valuable information related to human diseases. In recent years, computational methods have gained prominence as an effective means for identifying promoters, offering a more efficient alternative to labor-intensive biological approaches. RESULTS: In this study, a two-stage integrated predictor called "msBERT-Promoter" is proposed for identifying promoters and predicting their strengths. The model incorporates multi-scale sequence information through a tokenization strategy and fine-tunes the DNABERT model. Soft voting is then used to fuse the multi-scale information, effectively addressing the issue of insufficient DNA sequence information extraction in traditional models. To the best of our knowledge, this is the first time an integrated approach has been used in the DNABERT model for promoter identification and strength prediction. Our model achieves accuracy rates of 96.2% for promoter identification and 79.8% for promoter strength prediction, significantly outperforming existing methods. Furthermore, through attention mechanism analysis, we demonstrate that our model can effectively combine local and global sequence information, enhancing its interpretability. CONCLUSIONS: msBERT-Promoter provides an effective tool that successfully captures sequence-related attributes of DNA promoters and can accurately identify promoters and predict their strengths. This work paves a new path for the application of artificial intelligence in traditional biology.

Subject(s)

Genetic Promoter Regions, Computational Biology/methods, DNA/genetics, Humans, Genetic Models, DNA Sequence Analysis/methods
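The soft-voting step named in the msBERT-Promoter abstract above averages the class probabilities produced by several models and picks the class with the highest average. The sketch below shows only that generic fusion rule. It assumes three hypothetical models trained at different tokenization scales, and the probabilities are made up; none of this reproduces the authors' implementation.

```python
def soft_vote(prob_rows, weights=None):
    """Fuse per-model class probabilities by (weighted) averaging.

    prob_rows: one probability vector per model, all of the same length.
    Returns the averaged probabilities and the index of the winning class.
    """
    if not prob_rows:
        raise ValueError("need at least one model output")
    if weights is None:
        weights = [1.0] * len(prob_rows)
    total = sum(weights)
    n_classes = len(prob_rows[0])
    avg = [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=avg.__getitem__)
    return avg, label


# Hypothetical outputs [P(non-promoter), P(promoter)] from three scales.
scale_probs = [
    [0.40, 0.60],
    [0.55, 0.45],
    [0.20, 0.80],
]
avg, label = soft_vote(scale_probs)
```

Unlike hard (majority) voting, this rule lets a confident model outweigh an uncertain one, which is the usual reason for preferring it when model outputs are calibrated probabilities.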

9.

Application and Comparison of Machine Learning and Database-Based Methods in Taxonomic Classification of High-Throughput Sequencing Data.

Tian, Qinzhong; Zhang, Pinglu; Zhai, Yixiao; Wang, Yansu; Zou, Quan.

Genome Biol Evol ; 16(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38748485

ABSTRACT

The advent of high-throughput sequencing technologies has not only revolutionized the field of bioinformatics but has also heightened the demand for efficient taxonomic classification. Despite technological advancements, efficiently processing and analyzing the deluge of sequencing data for precise taxonomic classification remains a formidable challenge. Existing classification approaches primarily fall into two categories, database-based methods and machine learning methods, each presenting its own set of challenges and advantages. On this basis, the aim of our study was to conduct a comparative analysis between these two methods while also investigating the merits of integrating multiple database-based methods. Through an in-depth comparative study, we evaluated the performance of both methodological categories in taxonomic classification by utilizing simulated data sets. Our analysis revealed that database-based methods excel in classification accuracy when backed by a rich and comprehensive reference database. Conversely, while machine learning methods show superior performance in scenarios where reference sequences are sparse or lacking, they generally show inferior performance compared with database methods under most conditions. Moreover, our study confirms that integrating multiple database-based methods does, in fact, enhance classification accuracy. These findings shed new light on the taxonomic classification of high-throughput sequencing data and bear substantial implications for the future development of computational biology. For those interested in further exploring our methods, the source code of this study is publicly available on https://github.com/LoadStar822/Genome-Classifier-Performance-Evaluator. Additionally, a dedicated webpage showcasing our collected database, data sets, and various classification software can be found at http://lab.malab.cn/~tqz/project/taxonomic/.

Subject(s)

High-Throughput Nucleotide Sequencing, Machine Learning, Genetic Databases, Computational Biology/methods, Classification/methods

10.

scMNMF: a novel method for single-cell multi-omics clustering based on matrix factorization.

Qiu, Yushan; Guo, Dong; Zhao, Pu; Zou, Quan.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38754408

ABSTRACT

MOTIVATION: The technology for analyzing single-cell multi-omics data has advanced rapidly and has provided comprehensive and accurate cellular information by exploring cell heterogeneity in genomics, transcriptomics, epigenomics, metabolomics and proteomics data. However, because of the high-dimensional and sparse characteristics of single-cell multi-omics data, as well as the limitations of various analysis algorithms, the clustering performance is generally poor. Matrix factorization is an unsupervised, dimensionality reduction-based method that can cluster individuals and discover related omics variables from different blocks. Here, we present a novel algorithm, which we named scMNMF, that performs joint dimensionality reduction learning and cell clustering analysis on single-cell multi-omics data using non-negative matrix factorization. We formulate the objective function of joint learning as a constrained optimization problem and derive the corresponding iterative formulas through alternating iterative algorithms. The major advantage of the scMNMF algorithm is its capability to explore hidden related features among omics data. Additionally, the feature selection for dimensionality reduction and cell clustering mutually influence each other iteratively, leading to a more effective discovery of cell types. We validated the performance of the scMNMF algorithm using two simulated and five real datasets. The results show that scMNMF outperformed seven other state-of-the-art algorithms in various measurements. AVAILABILITY AND IMPLEMENTATION: scMNMF code can be found at https://github.com/yushanqiu/scMNMF.

Subject(s)

Algorithms, Single-Cell Analysis, Single-Cell Analysis/methods, Cluster Analysis, Humans, Genomics/methods, Computational Biology/methods, Proteomics/methods, Metabolomics/methods, Epigenomics/methods, Multiomics

11.

Integrated convolution and self-attention for improving peptide toxicity prediction.

Jiao, Shihu; Ye, Xiucai; Sakurai, Tetsuya; Zou, Quan; Liu, Ruijun.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38696758

ABSTRACT

MOTIVATION: Peptides are promising agents for the treatment of a variety of diseases due to their specificity and efficacy. However, the development of peptide-based drugs is often hindered by the potential toxicity of peptides, which poses a significant barrier to their clinical application. Traditional experimental methods for evaluating peptide toxicity are time-consuming and costly, making the development process inefficient. Therefore, there is an urgent need for computational tools specifically designed to predict peptide toxicity accurately and rapidly, facilitating the identification of safe peptide candidates for drug development. RESULTS: We provide here a novel computational approach, CAPTP, which leverages the power of convolution and self-attention to enhance the prediction of peptide toxicity from amino acid sequences. CAPTP demonstrates outstanding performance, achieving a Matthews correlation coefficient of approximately 0.82 in both cross-validation settings and on independent test datasets. This performance surpasses that of existing state-of-the-art peptide toxicity predictors. Importantly, CAPTP maintains its robustness and generalizability even when dealing with data imbalances. Further analysis by CAPTP reveals that certain sequential patterns, particularly in the head and central regions of peptides, are crucial in determining their toxicity. This insight can significantly inform and guide the design of safer peptide drugs. AVAILABILITY AND IMPLEMENTATION: The source code for CAPTP is freely available at https://github.com/jiaoshihu/CAPTP.

Subject(s)

Computational Biology, Peptides, Peptides/chemistry, Computational Biology/methods, Humans, Amino Acid Sequence, Algorithms, Software
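The Matthews correlation coefficient (MCC) quoted in the CAPTP abstract above is computed from the four cells of a binary confusion matrix and stays informative on imbalanced data. A minimal sketch of the standard formula follows; the label vectors are invented for illustration and are unrelated to the paper's datasets.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal count is zero (the usual convention).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up labels: 1 = toxic peptide, 0 = non-toxic peptide.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef(y_true, y_pred)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, so a classifier that labels every peptide as the majority class scores 0 regardless of how skewed the classes are.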

12.

Deciphering Microbial Adaptation in the Rhizosphere: Insights into Niche Preference, Functional Profiles, and Cross-Kingdom Co-occurrences.

Wang, Yansu; Zou, Quan.

Microb Ecol ; 87(1): 74, 2024 May 21.

Article in English | MEDLINE | ID: mdl-38771320

ABSTRACT

Rhizosphere microbial communities are critical factors for plant growth and vitality, and their adaptive differentiation strategies have received increasing attention but remain poorly understood. In this study, we obtained bacterial and fungal amplicon sequences from the rhizosphere and bulk soils of various ecosystems to investigate the potential mechanisms of microbial adaptation to the rhizosphere environment. Our focus encompasses three aspects: niche preference, functional profiles, and cross-kingdom co-occurrence patterns. Our findings revealed a correlation between niche similarity and nucleotide distance, suggesting that niche adaptation explains nucleotide variation among some closely related amplicon sequence variants (ASVs). Furthermore, biological macromolecule metabolism and communication among abundant bacteria increase under rhizosphere conditions, suggesting that bacterial function is trait-mediated in terms of fitness in new habitats. Additionally, our analysis of cross-kingdom networks revealed that fungi act as intermediaries that facilitate connections between bacteria, indicating that microbes can modify their cooperative relationships to adapt. Overall, the evidence for rhizosphere microbial community adaptation, via differences in genetic, functional, and co-occurrence patterns, elucidates the adaptive benefits of genetic and functional flexibility of the rhizosphere microbiota through niche shifts.

Subject(s)

Physiological Adaptation, Bacteria, Fungi, Microbiota, Rhizosphere, Soil Microbiology, Fungi/genetics, Fungi/classification, Fungi/physiology, Bacteria/genetics, Bacteria/classification, Bacteria/metabolism, Bacteria/isolation & purification, Ecosystem, Bacterial Physiological Phenomena

13.

Fusion of multi-source relationships and topology to infer lncRNA-protein interactions.

Zhang, Xinyu; Liu, Mingzhe; Li, Zhen; Zhuo, Linlin; Fu, Xiangzheng; Zou, Quan.

Mol Ther Nucleic Acids ; 35(2): 102187, 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38706631

ABSTRACT

Long non-coding RNAs (lncRNAs) are important factors involved in biological regulatory networks. Accurately predicting lncRNA-protein interactions (LPIs) is vital for clarifying lncRNA's functions and pathogenic mechanisms. Existing deep learning models have yet to yield satisfactory results in LPI prediction. Recently, graph autoencoders (GAEs) have seen rapid development, excelling in tasks like link prediction and node classification. We employed GAE technology for LPI prediction, devising the FMSRT-LPI model based on path masking and degree regression strategies and thereby achieving satisfactory outcomes. This represents the first known integration of path masking and degree regression strategies into the GAE framework for potential LPI inference. The effectiveness of our FMSRT-LPI model primarily relies on four key aspects. First, within the GAE framework, our model integrates multi-source relationships of lncRNAs and proteins with the topological data of the lncRNA-protein network (LPN). Second, the implemented masking strategy efficiently identifies the LPN's key paths, reconstructs the network, and reduces the impact of redundant or incorrect data. Third, the integrated degree decoder balances degree and structural information, enhancing node representation. Fourth, the PolyLoss function we introduced is more appropriate for LPI prediction tasks. The results on multiple public datasets further demonstrate our model's potential in LPI prediction.

14.

scTPC: a novel semisupervised deep clustering model for scRNA-seq data.

Qiu, Yushan; Yang, Lingfei; Jiang, Hao; Zou, Quan.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38684178

ABSTRACT

MOTIVATION: Continuous advancements in single-cell RNA sequencing (scRNA-seq) technology have enabled researchers to further explore the study of cell heterogeneity, trajectory inference, identification of rare cell types, and neurology. Accurate scRNA-seq data clustering is crucial in single-cell sequencing data analysis. However, the high dimensionality, sparsity, and presence of "false" zero values in the data can pose challenges to clustering. Furthermore, current unsupervised clustering algorithms have not effectively leveraged prior biological knowledge, making cell clustering even more challenging. RESULTS: This study investigates a semisupervised clustering model called scTPC, which integrates the triplet constraint, pairwise constraint, and cross-entropy constraint based on deep learning. Specifically, the model begins by pretraining a denoising autoencoder based on a zero-inflated negative binomial distribution. Deep clustering is then performed in the learned latent feature space using triplet constraints and pairwise constraints generated from partially labeled cells. Finally, to address imbalanced cell-type datasets, a weighted cross-entropy loss is introduced to optimize the model. A series of experimental results on 10 real scRNA-seq datasets and five simulated datasets demonstrate that scTPC achieves accurate clustering with a well-designed framework. AVAILABILITY AND IMPLEMENTATION: scTPC is a Python-based algorithm, and the code is available from https://github.com/LF-Yang/Code or https://zenodo.org/records/10951780.

Subject(s)

Algorithms, Single-Cell Analysis, Single-Cell Analysis/methods, Cluster Analysis, Humans, RNA Sequence Analysis/methods, RNA-Seq/methods, Deep Learning, Software, Single-Cell Gene Expression Analysis
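The scTPC abstract above mentions a weighted cross-entropy loss for imbalanced cell types without stating how the weights are chosen. The sketch below illustrates the general idea under one common assumption, inverse class-frequency weights. The helper names and the toy labels are hypothetical and should not be read as the scTPC implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: weight_c = N / (K * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the total weight."""
    loss, total_w = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        loss += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        total_w += w
    return loss / total_w


# Toy data: six cells of a common type (0) and two of a rare type (1).
labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# With uninformative predictions every cell contributes log(2),
# so the weighted mean is log(2) whatever the weights are.
uniform = [[0.5, 0.5]] * len(labels)
uniform_loss = weighted_cross_entropy(uniform, labels, weights)
```

With these weights each rare-type cell counts three times as much as a common-type cell, so errors on the rare population are not drowned out by the majority class.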

15.

Multi-kernel Learning Fusion Algorithm Based on RNN and GRU for ASD Diagnosis and Pathogenic Brain Region Extraction.

Chen, Jie; Zhang, Huilian; Zou, Quan; Liao, Bo; Bi, Xia-An.

Interdiscip Sci ; 2024 Apr 29.

Article in English | MEDLINE | ID: mdl-38683281

ABSTRACT

Autism spectrum disorder (ASD) is a complex, severe disorder related to brain development. It impairs patients' language communication and social behaviors. In recent years, ASD research has focused on single-modal neuroimaging data, neglecting the complementarity between multi-modal data. This omission may lead to poor classification. Therefore, it is important to study multi-modal data of ASD to reveal its pathogenesis. Furthermore, the recurrent neural network (RNN) and the gated recurrent unit (GRU) are effective for sequence data processing. In this paper, we introduce a novel framework for a Multi-Kernel Learning Fusion algorithm based on RNN and GRU (MKLF-RAG). The framework utilizes RNN and GRU to provide feature selection for data of different modalities. These features are then fused by the MKLF algorithm to detect the pathological mechanisms of ASD and extract the Regions of Interest (ROIs) most relevant to the disease. The MKLF-RAG proposed in this paper has been tested in a variety of experiments with the Autism Brain Imaging Data Exchange (ABIDE) database. Experimental findings indicate that our framework notably enhances the classification accuracy for ASD. Compared with other methods, MKLF-RAG demonstrates superior efficacy across multiple evaluation metrics and could provide valuable insights into the early diagnosis of ASD.

16.

TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments.

Zhai, Yixiao; Chao, Jiannan; Wang, Yizheng; Zhang, Pinglu; Tang, Furong; Zou, Quan.

PLoS Comput Biol ; 20(4): e1011988, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38557416

ABSTRACT

Accurate multiple sequence alignment (MSA) is imperative for the comprehensive analysis of biological sequences. However, a notable challenge arises as no single MSA tool consistently outperforms its counterparts across diverse datasets. Users often have to try multiple MSA tools to achieve optimal alignment results, which can be time-consuming and memory-intensive. While the overall accuracy of certain MSA results may be lower, there could be local regions with the highest alignment scores, prompting researchers to seek a tool capable of merging these locally optimal results from multiple initial alignments into a globally optimal alignment. In this study, we introduce Two Pointers Meta-Alignment (TPMA), a novel tool designed for the integration of nucleic acid sequence alignments. TPMA employs two pointers to partition the initial alignments into blocks containing identical sequence fragments. It selects the blocks with the highest sum-of-pairs (SP) scores and concatenates them into an alignment with an overall SP score superior to that of the initial alignments. Through tests on simulated and real datasets, the experimental results consistently demonstrate that TPMA outperforms M-Coffee in terms of aSP, Q, and total column (TC) scores across most datasets. Even in cases where TPMA's scores are comparable to those of M-Coffee, TPMA exhibits significantly lower running time and memory consumption. Furthermore, we comprehensively assessed all the MSA tools used in the experiments, considering accuracy, time, and memory consumption. We propose accurate and fast combination strategies for small and large datasets, which streamline the user tool selection process and facilitate large-scale dataset integration. The dataset and source code of TPMA are available on GitHub (https://github.com/malabz/TPMA).

Subject(s)

Algorithms, Nucleic Acids, Sequence Alignment, Coffee, Software
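The TPMA abstract above describes partitioning two alignments into blocks that contain identical sequence fragments and keeping the block with the higher SP score. The following is a simplified two-pointer sketch of that idea for exactly two input alignments. The cut-point rule, the scoring values (match 1, mismatch -1, gap -2) and the toy alignments are assumptions made for illustration, not the published algorithm.

```python
from itertools import combinations


def sp_score(block, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of a block given as a list of equal-length rows."""
    score = 0
    for col in zip(*block):
        for x, y in combinations(col, 2):
            if x == "-" and y == "-":
                continue  # gap-gap pairs are ignored
            score += gap if "-" in (x, y) else (match if x == y else mismatch)
    return score


def cut_points(aln_a, aln_b):
    """Column pairs (i, j) where both alignments have consumed the same
    number of residues from every sequence."""
    if [r.replace("-", "") for r in aln_a] != [r.replace("-", "") for r in aln_b]:
        raise ValueError("alignments must contain the same sequences in the same order")
    n, len_a, len_b = len(aln_a), len(aln_a[0]), len(aln_b[0])
    seen_a, seen_b = [0] * n, [0] * n
    i = j = 0
    cuts = [(0, 0)]
    while i < len_a or j < len_b:
        if seen_a == seen_b:
            step_a, step_b = i < len_a, j < len_b
        elif any(x < y for x, y in zip(seen_a, seen_b)):
            step_a, step_b = True, False   # alignment A lags behind
        else:
            step_a, step_b = False, True   # alignment B lags behind
        if step_a:
            for s in range(n):
                seen_a[s] += aln_a[s][i] != "-"
            i += 1
        if step_b:
            for s in range(n):
                seen_b[s] += aln_b[s][j] != "-"
            j += 1
        if seen_a == seen_b:
            cuts.append((i, j))
    return cuts


def merge(aln_a, aln_b):
    """Concatenate, block by block, whichever alignment scores higher."""
    merged = [""] * len(aln_a)
    cuts = cut_points(aln_a, aln_b)
    for (i0, j0), (i1, j1) in zip(cuts, cuts[1:]):
        block_a = [row[i0:i1] for row in aln_a]
        block_b = [row[j0:j1] for row in aln_b]
        best = block_a if sp_score(block_a) >= sp_score(block_b) else block_b
        merged = [m + b for m, b in zip(merged, best)]
    return merged


# Toy alignments of the same three sequences (ACGTA, AGTTA, ACGTA).
aln_a = ["ACGTA-", "A-GTTA", "ACGT-A"]
aln_b = ["ACGT-A", "AG-TTA", "ACGT-A"]
merged = merge(aln_a, aln_b)
```

In this toy case the first alignment is better at the start and the second is better at the end, so the merged result takes blocks from both and scores higher than either input under the assumed scoring scheme.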

17.

Revisiting drug-protein interaction prediction: a novel global-local perspective.

Zhou, Zhecheng; Liao, Qingquan; Wei, Jinhang; Zhuo, Linlin; Wu, Xiaonan; Fu, Xiangzheng; Zou, Quan.

Bioinformatics ; 40(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38648052

ABSTRACT

MOTIVATION: Accurate inference of potential drug-protein interactions (DPIs) aids in understanding drug mechanisms and developing novel treatments. Existing deep learning models, however, struggle with accurate node representation in DPI prediction, limiting their performance. RESULTS: We propose a new computational framework that integrates global and local features of nodes in the drug-protein bipartite graph for efficient DPI inference. Initially, we employ pre-trained models to acquire fundamental knowledge of drugs and proteins and to determine their initial features. Subsequently, the MinHash and HyperLogLog algorithms are utilized to estimate the similarity and set cardinality between drug and protein subgraphs, serving as their local features. Then, an energy-constrained diffusion mechanism is integrated into the transformer architecture, capturing interdependencies between nodes in the drug-protein bipartite graph and extracting their global features. Finally, we fuse the local and global features of nodes and employ multilayer perceptrons to predict the likelihood of potential DPIs. A comprehensive and precise node representation guarantees efficient prediction of unknown DPIs by the model. Various experiments validate the accuracy and reliability of our model, with molecular docking results revealing its capability to identify potential DPIs not present in existing databases. This approach is expected to offer valuable insights for furthering drug repurposing and personalized medicine research. AVAILABILITY AND IMPLEMENTATION: Our code and data are accessible at: https://github.com/ZZCrazy00/DPI.

Subject(s)

Algorithms , Molecular Docking Simulation , Proteins , Proteins/chemistry , Proteins/metabolism , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Computational Biology/methods , Deep Learning

18.

Electrochemically Enable N-Sulfenylation/Phosphinylation of Sulfoximines via Oxidative Dehydrocoupling Reaction.

Zhang, Wenbao; Jin, Dongsheng; Hu, Yongkang; Yin, Kun; Zou, Quan; Tang, Liang; Qian, Peng.

J Org Chem ; 89(9): 6106-6116, 2024 May 03.

Article in English | MEDLINE | ID: mdl-38632856

ABSTRACT

An electrochemical oxidative cross-coupling strategy for the synthesis of N-sulfenylsulfoximines from sulfoximines and thiols was accomplished, giving diverse N-sulfenylsulfoximines in moderate to good yields. Moreover, this strategy can be extended to construct the N-P bond of N-phosphinylated sulfoximines. With electrons as reagents, the oxidative dehydrogenation cross-coupling reaction proceeds smoothly in the absence of traditional redox reagents.

19.

Integrating Single-Cell and Spatial Transcriptomics Reveals Heterogeneity of Early Pig Skin Development and a Subpopulation with Hair Placode Formation.

Wang, Yi; Jiang, Yao; Ni, Guiyan; Li, Shujuan; Balderson, Brad; Zou, Quan; Liu, Huatao; Jiang, Yifan; Sun, Jingchun; Ding, Xiangdong.

Adv Sci (Weinh) ; 11(20): e2306703, 2024 May.

Article in English | MEDLINE | ID: mdl-38561967

ABSTRACT

The dermis and epidermis, crucial structural layers of the skin, encompass appendages, hair follicles (HFs), and intricate cellular heterogeneity. However, an integrated spatiotemporal transcriptomic atlas of embryonic skin has not yet been described and would be invaluable for studying skin-related diseases in humans. Here, single-cell and spatial transcriptomic analyses are performed on skin samples of normal and hairless fetal pigs across four developmental periods. The cross-species comparison of skin cells illustrated that the pig epidermis is more representative of the human epidermis than the mouse epidermis is. Moreover, phenome-wide association study analysis revealed that the conserved genes between pigs and humans are strongly associated with human skin-related diseases. In the epidermis, two lineage differentiation trajectories describe hair follicle (HF) morphogenesis and epidermal development. By comparing normal and hairless fetal pigs, it is found that the hair placode (Pc), the most characteristic initial structure in HFs, arises from progenitor-like OGN+/UCHL1+ cells. These progenitors appear earlier in development than the previously described early Pc cells and exhibit abnormal proliferation and migration during differentiation in hairless pigs. The study provides a valuable resource for in-depth insights into HF development, which may serve as a key reference atlas for studying human skin disease etiology using porcine models.

Subject(s)

Hair Follicle , Transcriptome , Animals , Swine/genetics , Swine/embryology , Hair Follicle/metabolism , Hair Follicle/embryology , Hair Follicle/growth & development , Transcriptome/genetics , Single-Cell Analysis/methods , Skin/metabolism , Skin/embryology , Cell Differentiation/genetics , Gene Expression Profiling/methods , Humans , Mice

20.

Editorial: Artificial intelligence in drug discovery and development.

Wei, Leyi; Zou, Quan; Zeng, Xiangxiang.

Methods ; 226: 133-137, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38582311

Subject(s)

Artificial Intelligence , Drug Discovery , Drug Discovery/methods , Humans , Drug Development/methods , Drug Development/trends
