Search | VHL Regional Portal

1.

Deep Canonical Correlation Fusion Algorithm Based on Denoising Autoencoder for ASD Diagnosis and Pathogenic Brain Region Identification.

Zhang, Huilian; Chen, Jie; Liao, Bo; Wu, Fang-Xiang; Bi, Xia-An.

Interdiscip Sci ; 2024 Apr 04.

Article in English | MEDLINE | ID: mdl-38573456

ABSTRACT

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by atypical neural activity. Early intervention is key to managing the progression of ASD, and current research primarily focuses on the use of structural magnetic resonance imaging (sMRI) or resting-state functional magnetic resonance imaging (rs-fMRI) for diagnosis. Moreover, the use of autoencoders for disease classification has not been sufficiently explored. In this study, we introduce a new autoencoder-based framework, the Deep Canonical Correlation Fusion algorithm based on Denoising Autoencoder (DCCF-DAE), which proves to be effective in handling high-dimensional data. The framework first extracts features from different types of data with an advanced autoencoder and then fuses these features through the DCCF model. The fused features are then used for disease classification. DCCF integrates functional and structural data to help accurately diagnose ASD and identify critical Regions of Interest (ROIs) in disease mechanisms. We compare the proposed framework with other methods on the Autism Brain Imaging Data Exchange (ABIDE) database, and the results demonstrate its outstanding performance in ASD diagnosis. The superiority of DCCF-DAE highlights its potential as a crucial tool for early ASD diagnosis and monitoring.

2.

pathMap: a path-based mapping tool for long noisy reads with high sensitivity.

Wei, Ze-Gang; Zhang, Xiao-Dan; Fan, Xing-Guo; Qian, Yu; Liu, Fei; Wu, Fang-Xiang.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38517696

ABSTRACT

With the rapid development of single-molecule sequencing (SMS) technologies, the output read length is continuously increasing. Mapping such reads onto a reference genome is one of the most fundamental tasks in sequence analysis. Mapping sensitivity is becoming a major concern, since a highly sensitive mapper detects more aligned regions on the reference and obtains more aligned bases, which are useful for downstream analysis. In this study, we present pathMap, a novel k-mer graph-based mapper that is specifically designed for mapping SMS reads with high sensitivity. By viewing the alignment chain as a path containing as many anchors as possible in the matched k-mer graph, pathMap treats chaining as a path selection problem in a directed graph. pathMap iteratively searches for the longest path among the remaining nodes, so that more high-quality candidate chains can be effectively detected and aligned. Compared to other state-of-the-art mapping methods such as minimap2 and Winnowmap2, experimental results on simulated and real-life datasets demonstrate that pathMap obtains at least 11.50% more mapped chains than its closest competitor and increases the mapping sensitivity by 17.28% and 13.84% of bases over the next-best mapper for Pacific Biosciences and Oxford Nanopore sequencing data, respectively. In addition, pathMap is more robust to sequencing errors and more sensitive in species- and strain-specific identification of pathogens using MinION reads.
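
The chaining idea in this abstract (treat anchors as nodes, take the longest path, remove it, repeat) can be sketched roughly as follows. This is a minimal illustration, not pathMap's implementation: the `(read_pos, ref_pos)` anchor representation, the strict colinearity rule for edges, the `min_anchors` cutoff, and both function names are assumptions made for the example.

```python
from typing import List, Tuple

Anchor = Tuple[int, int]  # hypothetical: (read_pos, ref_pos) of one matched k-mer


def longest_chain(anchors: List[Anchor]) -> List[Anchor]:
    """Longest path in the implicit DAG where a -> b iff b lies strictly
    after a on both the read and the reference (assumed edge rule)."""
    if not anchors:
        return []
    order = sorted(anchors)
    best = [1] * len(order)   # best[i]: length of the longest path ending at order[i]
    prev = [-1] * len(order)  # back-pointer used to recover that path
    for i, (ri, gi) in enumerate(order):
        for j in range(i):
            rj, gj = order[j]
            if rj < ri and gj < gi and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    i = max(range(len(order)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(order[i])
        i = prev[i]
    return chain[::-1]


def iterative_chains(anchors: List[Anchor], min_anchors: int = 2) -> List[List[Anchor]]:
    """Repeatedly extract the longest chain from the anchors not yet used."""
    remaining = list(anchors)
    chains = []
    while remaining:
        chain = longest_chain(remaining)
        if len(chain) < min_anchors:  # assumed stopping rule
            break
        chains.append(chain)
        used = set(chain)
        remaining = [a for a in remaining if a not in used]
    return chains
```

The quadratic dynamic program is chosen for readability; a production mapper would presumably use a faster chaining scheme and a scoring function that accounts for gaps.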

Subjects

High-Throughput Nucleotide Sequencing; Nanopore Sequencing; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; Genome; Software; Algorithms

3.

Deep integrated fusion of local and global features for cervical cell classification.

Fang, Ming; Fu, Minghan; Liao, Bo; Lei, Xiujuan; Wu, Fang-Xiang.

Comput Biol Med ; 171: 108153, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38364660

ABSTRACT

Cervical cytology image classification is of great significance to cervical cancer diagnosis and prognosis. Recently, convolutional neural networks (CNNs) and vision transformers have been adopted as two branches that learn features for image classification, with the local and global features simply added together. However, such simple addition may not integrate these features effectively. In this study, we explore the synergy of local and global features of cytology images for classification tasks. Specifically, we design a Deep Integrated Feature Fusion (DIFF) block to synergize local and global features of cytology images from a CNN branch and a transformer branch. Our proposed method is evaluated on three cervical cell image datasets (SIPaKMeD, CRIC, Herlev) and a large blood cell dataset, BCCD, for several multi-class and binary classification tasks. Experimental results demonstrate the effectiveness of the proposed method in cervical cell classification, which could help medical specialists better diagnose cervical cancer.

Subjects

Uterine Cervical Neoplasms; Female; Humans; Learning; Neural Networks, Computer; Image Processing, Computer-Assisted

4.

OIF-Net: An Optical Flow Registration-Based PET/MR Cross-Modal Interactive Fusion Network for Low-Count Brain PET Image Denoising.

Fu, Minghan; Zhang, Na; Huang, Zhenxing; Zhou, Chao; Zhang, Xu; Yuan, Jianmin; He, Qiang; Yang, Yongfeng; Zheng, Hairong; Liang, Dong; Wu, Fang-Xiang; Fan, Wei; Hu, Zhanli.

IEEE Trans Med Imaging ; 43(4): 1554-1567, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38096101

ABSTRACT

The short frames of low-count positron emission tomography (PET) images generally cause high levels of statistical noise. Thus, improving the quality of low-count images by using image postprocessing algorithms to achieve better clinical diagnoses has attracted widespread attention in the medical imaging community. Most existing deep learning-based low-count PET image enhancement methods have achieved satisfactory results; however, few of them focus on denoising low-count PET images with the magnetic resonance (MR) image modality as guidance. The prior context features contained in MR images can provide abundant and complementary information for single low-count PET image denoising, especially in ultralow-count (2.5%) cases. To this end, we propose a novel two-stream dual PET/MR cross-modal interactive fusion network with an optical flow pre-alignment module, namely, OIF-Net. Specifically, the learnable optical flow registration module enables the spatial manipulation of MR imaging inputs within the network without any extra training supervision. Registered MR images fundamentally solve the problem of feature misalignment in the multimodal fusion stage, which greatly benefits the subsequent denoising process. In addition, we design a spatial-channel feature enhancement module (SC-FEM) that considers the interactive impacts of multiple modalities and provides additional information flexibility in both the spatial and channel dimensions. Furthermore, instead of simply concatenating the features extracted from the two modalities as an intermediate fusion method, the proposed cross-modal feature fusion module (CM-FFM) adopts cross-attention at multiple feature levels and greatly improves the feature fusion of the two modalities. Extensive experimental assessments conducted on real clinical datasets, as well as an independent clinical testing dataset, demonstrate that the proposed OIF-Net outperforms the state-of-the-art methods.

Subjects

Image Processing, Computer-Assisted; Optic Flow; Image Processing, Computer-Assisted/methods; Positron-Emission Tomography/methods; Magnetic Resonance Imaging/methods; Brain/diagnostic imaging

5.

invMap: a sensitive mapping tool for long noisy reads with inversion structural variants.

Wei, Ze-Gang; Bu, Peng-Yu; Zhang, Xiao-Dan; Liu, Fei; Qian, Yu; Wu, Fang-Xiang.

Bioinformatics ; 39(12), 2023 Dec 01.

Article in English | MEDLINE | ID: mdl-38058196

ABSTRACT

MOTIVATION: Longer reads produced by PacBio or Oxford Nanopore sequencers could more frequently span the breakpoints of structural variations (SVs) than shorter reads. Therefore, existing long-read mapping methods often generate wrong alignments and variant calls. Compared to deletions and insertions, inversion events are more difficult to detect, since the anchors in inversion regions are nonlinear to those in SV-free regions. To address this issue, this study presents a novel long-read mapping algorithm named invMap. RESULTS: For each long noisy read, invMap first locates the aligned region with a specifically designed scoring method for chaining, then checks the remaining anchors in the aligned region to discover potential inversions. We benchmark invMap on simulated datasets across different genomes and sequencing coverages; experimental results demonstrate that invMap locates aligned regions and calls inversion SVs more accurately than the competing methods. The real human genome sequencing dataset of NA12878 illustrates that invMap can effectively find more candidate variant calls for inversions than the competing methods. AVAILABILITY AND IMPLEMENTATION: The invMap software is available at https://github.com/zhang134/invMap.git.

Subjects

Genomics; High-Throughput Nucleotide Sequencing; Humans; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Software; Algorithms; Genome, Human; Chromosome Inversion; Sequence Analysis, DNA/methods

6.

Sparse2Noise: Low-dose synchrotron X-ray tomography without high-quality reference data.

Duan, Xiaoman; Ding, Xiao Fan; Li, Naitao; Wu, Fang-Xiang; Chen, Xiongbiao; Zhu, Ning.

Comput Biol Med ; 165: 107473, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37690288

ABSTRACT

BACKGROUND: Synchrotron radiation computed tomography (SR-CT) holds promise for high-resolution in vivo imaging. Notably, the reconstruction of SR-CT images requires a large set of data to be captured with sufficient photons from multiple angles, resulting in a high radiation dose received by the object. Reducing the number of projections and/or the photon flux is a straightforward means to lessen the radiation dose; however, it compromises data completeness and thus introduces noise and artifacts. Deep learning (DL)-based supervised methods effectively denoise and remove artifacts, but they depend heavily on high-quality paired data acquired at high doses. Although algorithms exist for training without high-quality references, they struggle to effectively eliminate persistent artifacts present in real-world data. METHODS: This work presents a novel low-dose imaging strategy, Sparse2Noise, which combines the reconstruction data from a paired sparse-view CT scan (normal-flux) and full-view CT scan (low-flux) using a convolutional neural network (CNN). Sparse2Noise does not require high-quality reconstructed data as references and allows training from scratch on very small datasets. Sparse2Noise was evaluated on both simulated and experimental data. RESULTS: Sparse2Noise effectively reduces noise and ring artifacts while maintaining high image quality, outperforming state-of-the-art image denoising methods at the same dose levels. Furthermore, Sparse2Noise produces high image quality for ex vivo rat hindlimb imaging at an acceptably low radiation dose (i.e., 0.5 Gy with an isotropic voxel size of 26 µm). CONCLUSIONS: This work represents a significant advance towards in vivo SR-CT imaging. It is noteworthy that Sparse2Noise can also be used for denoising in conventional CT and/or phase-contrast CT.

Subjects

Synchrotrons; Tomography, X-Ray Computed; Animals; Rats; Photons; Algorithms; Artifacts

7.

A posterior probability based Bayesian method for single-cell RNA-seq data imputation.

Chen, Siqi; Zheng, Ruiqing; Tian, Luyi; Wu, Fang-Xiang; Li, Min.

Methods ; 216: 21-38, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37315825

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) data contain a large number of zeros. Such dropout events impede downstream data analyses. We propose BayesImpute to infer and impute dropouts from scRNA-seq data. Using the expression rate and coefficient of variation of the genes within the cell subpopulation, BayesImpute first determines likely dropouts, then constructs the posterior distribution for each gene and uses the posterior mean to impute dropout values. Experiments on simulated and real data show that BayesImpute can effectively identify dropout events and reduce the introduction of false positive signals. Additionally, BayesImpute successfully recovers the true expression levels of missing values, restores the gene-to-gene and cell-to-cell correlation coefficients, and maintains the biological information in bulk RNA-seq data. Furthermore, BayesImpute boosts the clustering and visualization of cell subpopulations and improves the identification of differentially expressed genes. We further demonstrate that, in comparison to other statistical imputation methods, BayesImpute is scalable and fast with minimal memory usage.
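
The two-step idea in this abstract (flag likely dropouts from the expression rate and coefficient of variation, then fill them with a posterior mean) might look roughly like the sketch below. It is only an illustration under loud assumptions: the abstract does not state BayesImpute's actual model, so a conjugate normal prior with known noise variance is swapped in, and the thresholds `min_rate` and `max_cv`, the prior parameters, and all function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def likely_dropout_gene(values: List[float], min_rate: float = 0.5,
                        max_cv: float = 1.0) -> bool:
    """Treat zeros as dropouts when the gene is expressed in most cells of the
    subpopulation and its nonzero values are stable (hypothetical thresholds)."""
    nonzero = [v for v in values if v > 0]
    if len(nonzero) < 2:
        return False
    rate = len(nonzero) / len(values)    # expression rate
    cv = pstdev(nonzero) / mean(nonzero)  # coefficient of variation
    return rate >= min_rate and cv <= max_cv


def posterior_mean(observed: List[float], prior_mean: float,
                   prior_var: float, noise_var: float) -> float:
    """Posterior mean of a normal mean under a normal prior (assumed model)."""
    precision = 1.0 / prior_var + len(observed) / noise_var
    return (prior_mean / prior_var + sum(observed) / noise_var) / precision


def impute_gene(values: List[float], prior_mean: float,
                prior_var: float = 1.0, noise_var: float = 1.0) -> List[float]:
    """Replace zeros of one gene within one subpopulation by the posterior mean."""
    if not likely_dropout_gene(values):
        return list(values)
    nonzero = [v for v in values if v > 0]
    fill = posterior_mean(nonzero, prior_mean, prior_var, noise_var)
    return [v if v > 0 else fill for v in values]
```

For example, with values `[4.0, 0.0, 6.0, 5.0]` and a prior mean of 3.0, the zero is replaced by the posterior mean 4.5, which sits between the prior and the observed average of 5.0.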

Subjects

Single-Cell Gene Expression Analysis; Software; Sequence Analysis, RNA/methods; Bayes Theorem; Single-Cell Analysis/methods; Probability; Gene Expression Profiling

8.

PreOBP_ML: Machine Learning Algorithms for Prediction of Optical Biosensor Parameters.

Ahmed, Kawsar; Bui, Francis M; Wu, Fang-Xiang.

Micromachines (Basel) ; 14(6), 2023 May 31.

Article in English | MEDLINE | ID: mdl-37374757

ABSTRACT

Developing standard optical biosensors requires a time-consuming simulation procedure. Machine learning might reduce that time and effort. Effective indices, core power, total power, and effective area are the most crucial parameters for evaluating optical sensors. In this study, several machine learning (ML) approaches have been applied to predict those parameters, with the core radius, cladding radius, pitch, analyte, and wavelength as the input vectors. We have utilized least squares (LS), LASSO, Elastic-Net (ENet), and Bayesian ridge regression (BRR) to make a comparative discussion using a balanced dataset obtained with the COMSOL Multiphysics simulation tool. Furthermore, a more extensive analysis of sensitivity, power fraction, and confinement loss is also demonstrated using the predicted and simulated data. The suggested models were also examined in terms of R2-score, mean absolute error (MAE), and mean squared error (MSE); all of the models had an R2-score of more than 0.99, and the optical biosensors were shown to have a design error rate of less than 3%. This research might pave the way for machine learning-based optimization approaches to be used to improve optical biosensors.
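
The three evaluation scores named in this abstract can be computed as in the following minimal sketch, with MAE taken as the mean of the absolute errors. The function name and the toy numbers in the usage note are illustrative assumptions, not data from the study.

```python
from typing import Dict, Sequence


def regression_scores(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R2-score, MAE, and MSE for paired simulated and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "mse": ss_res / n,
    }
```

Calling `regression_scores([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])` gives an R2 of about 0.98, an MAE of about 0.15, and an MSE of about 0.025.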

9.

A multi-modal deep neural network for multi-class liver cancer diagnosis.

Khan, Rayyan Azam; Fu, Minghan; Burbridge, Brent; Luo, Yigang; Wu, Fang-Xiang.

Neural Netw ; 165: 553-561, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37354807

ABSTRACT

Liver disease is a potentially asymptomatic clinical entity that may progress to patient death. This study proposes a multi-modal deep neural network for multi-class malignant liver diagnosis. In parallel with the portal venous computed tomography (CT) scans, pathology data is utilized to prognosticate primary liver cancer variants and metastasis. The processed CT scans are fed to the deep dilated convolution neural network to explore salient features. The residual connections are further added to address vanishing gradient problems. Correspondingly, five pathological features are learned using a wide and deep network that gives a benefit of memorization with generalization. The down-scaled hierarchical features from CT scan and pathology data are concatenated to pass through fully connected layers for classification between liver cancer variants. In addition, the transfer learning of pre-trained deep dilated convolution layers assists in handling insufficient and imbalanced dataset issues. The fine-tuned network can predict three-class liver cancer variants with an average accuracy of 96.06% and an Area Under Curve (AUC) of 0.832. To the best of our knowledge, this is the first study to classify liver cancer variants by integrating pathology and image data, hence following the medical perspective of malignant liver diagnosis. The comparative analysis on the benchmark dataset shows that the proposed multi-modal neural network outperformed most of the liver diagnostic studies and is comparable to others.

Subjects

Deep Learning; Liver Neoplasms; Humans; Neural Networks, Computer; Liver Neoplasms/diagnostic imaging; Diagnosis, Computer-Assisted/methods

10.

A Two-Branch Neural Network for Short-Axis PET Image Quality Enhancement.

Fu, Minghan; Wang, Meiyun; Wu, Yaping; Zhang, Na; Yang, Yongfeng; Wang, Haining; Zhou, Yun; Shang, Yue; Wu, Fang-Xiang; Zheng, Hairong; Liang, Dong; Hu, Zhanli.

IEEE J Biomed Health Inform ; 27(6): 2864-2875, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37030746

ABSTRACT

The axial field of view (FOV) is a key factor that affects the quality of PET images. Due to hardware FOV restrictions, conventional short-axis PET scanners with FOVs of 20 to 35 cm can acquire only low-quality PET (LQ-PET) images in fast scanning times (2-3 minutes). To overcome hardware restrictions and improve PET image quality for better clinical diagnoses, several deep learning-based algorithms have been proposed. However, these approaches use simple convolution layers with residual learning and local attention, which insufficiently extract and fuse long-range contextual information. To this end, we propose a novel two-branch network architecture with swin transformer units and graph convolution operation, namely SW-GCN. The proposed SW-GCN provides additional spatial- and channel-wise flexibility to handle different types of input information flow. Specifically, considering the high computational cost of calculating self-attention weights in full-size PET images, in our designed spatial adaptive branch, we take the self-attention mechanism within each local partition window and introduce global information interactions between nonoverlapping windows by shifting operations to prevent the aforementioned problem. In addition, the convolutional network structure considers the information in each channel equally during the feature extraction process. In our designed channel adaptive branch, we use a Watts Strogatz topology structure to connect each feature map to only its most relevant features in each graph convolutional layer, substantially reducing information redundancy. Moreover, ensemble learning is adopted in our SW-GCN for mapping distinct features from the two well-designed branches to the enhanced PET images. We carried out extensive experiments on three single-bed position scans for 386 patients. The test results demonstrate that our proposed SW-GCN approach outperforms state-of-the-art methods in both quantitative and qualitative evaluations.

Subjects

Algorithms; Neural Networks, Computer; Humans; Electric Power Supplies; Positron-Emission Tomography

11.

Biomarker Identification via a Factorization Machine-Based Neural Network With Binary Pairwise Encoding.

Ding, Yulian; Lei, Xiujuan; Liao, Bo; Wu, Fang-Xiang.

IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 2136-2146, 2023.

Article in English | MEDLINE | ID: mdl-37018561

ABSTRACT

Biomolecules such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) play critical roles in diverse fundamental and vital biological processes. They can serve as disease biomarkers, as their dysregulation could cause complex human diseases. Identifying those biomarkers is helpful for the diagnosis, treatment, prognosis, and prevention of diseases. In this study, we propose a factorization machine-based deep neural network with binary pairwise encoding, DFMbpe, to identify disease-related biomarkers. First, to comprehensively consider the interdependence of features, a binary pairwise encoding method is designed to obtain the raw feature representations for each biomarker-disease pair. Second, the raw features are mapped into their corresponding embedding vectors. Then, the factorization machine is applied to capture the wide low-order feature interdependence, while the deep neural network is applied to capture the deep high-order feature interdependence. Finally, the two kinds of features are combined to get the final prediction results. Unlike other biomarker identification models, the binary pairwise encoding considers the interdependence of features even when they never appear in the same sample, and the DFMbpe architecture emphasizes both low-order and high-order feature interactions simultaneously. The experimental results show that DFMbpe greatly outperforms the state-of-the-art identification models on both cross-validation and independent dataset evaluation. In addition, three types of case studies further demonstrate the effectiveness of this model.
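
The low-order interaction term that a factorization machine contributes can be sketched as below. This shows the standard second-order factorization machine score, not the DFMbpe architecture itself; the function names and the toy weights in the test are assumptions. Because each pairwise weight is the dot product of two embedding vectors, a pair of features gets an interaction weight even if the two never co-occur in a training sample.

```python
from typing import Sequence


def fm_predict(x: Sequence[float], w0: float, w: Sequence[float],
               V: Sequence[Sequence[float]]) -> float:
    """Second-order factorization machine score.

    x: feature vector, w0: bias, w: linear weights,
    V: one k-dimensional embedding per feature.
    The pairwise term sum_{i<j} <V[i], V[j]> x_i x_j is computed with the
    usual O(k * n) identity instead of an explicit double loop.
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def fm_pairwise_naive(x: Sequence[float], V: Sequence[Sequence[float]]) -> float:
    """Explicit double loop over feature pairs, kept only as a cross-check."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total
```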

Subjects

MicroRNAs; RNA, Long Noncoding; Humans; Neural Networks, Computer; Computational Biology/methods

12.

NMTF-DTI: A Nonnegative Matrix Tri-factorization Approach With Multiple Kernel Fusion for Drug-Target Interaction Prediction.

Jamali, Ali Akbar; Kusalik, Anthony; Wu, Fang-Xiang.

IEEE/ACM Trans Comput Biol Bioinform ; 20(1): 586-594, 2023.

Article in English | MEDLINE | ID: mdl-34914594

ABSTRACT

Prediction of drug-target interactions (DTIs) plays a significant role in drug development and drug discovery. Although this task requires a large investment in terms of time and cost, especially when it is performed experimentally, the results are not necessarily significant. Computational DTI prediction is a shortcut to reduce the risks of experimental methods. In this study, we propose an effective approach of nonnegative matrix tri-factorization, referred to as NMTF-DTI, to predict the interaction scores between drugs and targets. NMTF-DTI utilizes multiple kernels (similarity measures) for drugs and targets and Laplacian regularization to boost the prediction performance. The performance of NMTF-DTI is evaluated via cross-validation and is compared with existing DTI prediction methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR). We evaluate our method on four gold standard datasets, comparing to other state-of-the-art methods. Cross-validation and a separate, manually created dataset are used to set parameters. The results show that NMTF-DTI outperforms other competing methods. Moreover, the results of a case study also confirm the superiority of NMTF-DTI.
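
For orientation, a generic objective for nonnegative matrix tri-factorization with Laplacian regularization is written out below. It is a textbook-style form offered as an assumption; the abstract does not give the exact NMTF-DTI objective, and every symbol here is introduced only for illustration. Here $Y$ is the drug-target interaction matrix, $F$ and $G$ are nonnegative drug and target factor matrices, $S$ is the nonnegative core matrix, $L_d$ and $L_t$ are graph Laplacians built from the (fused) drug and target kernels, and $\lambda_d$, $\lambda_t$ are regularization weights.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0}\;
\lVert Y - F S G^{\top} \rVert_F^2
+ \lambda_d \,\mathrm{tr}\!\left(F^{\top} L_d F\right)
+ \lambda_t \,\mathrm{tr}\!\left(G^{\top} L_t G\right)
```

Under this form, the predicted interaction scores would be read from the reconstruction $F S G^{\top}$, and the trace terms pull similar drugs (or similar targets) toward similar factor rows.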

Subjects

Algorithms; Drug Development; Drug Discovery/methods; Drug Interactions; ROC Curve

13.

DeepCellEss: cell line-specific essential protein prediction with attention-based interpretable deep learning.

Li, Yiming; Zeng, Min; Zhang, Fuhao; Wu, Fang-Xiang; Li, Min.

Bioinformatics ; 39(1), 2023 Jan 01.

Article in English | MEDLINE | ID: mdl-36458923

ABSTRACT

MOTIVATION: Protein essentiality is usually accepted to be a conditional trait and strongly affected by cellular environments. However, existing computational methods often do not take such characteristics into account, preferring to incorporate all available data and train a general model for all cell lines. In addition, the lack of model interpretability limits further exploration and analysis of essential protein predictions. RESULTS: In this study, we proposed DeepCellEss, a sequence-based interpretable deep learning framework for cell line-specific essential protein predictions. DeepCellEss utilizes a convolutional neural network and bidirectional long short-term memory to learn short- and long-range latent information from protein sequences. Further, a multi-head self-attention mechanism is used to provide residue-level model interpretability. For model construction, we collected extremely large-scale benchmark datasets across 323 cell lines. Extensive computational experiments demonstrate that DeepCellEss yields effective prediction performance for different cell lines and outperforms existing sequence-based methods as well as network-based centrality measures. Finally, we conducted some case studies to illustrate the necessity of considering specific cell lines and the superiority of DeepCellEss. We believe that DeepCellEss can serve as a useful tool for predicting essential proteins across different cell lines. AVAILABILITY AND IMPLEMENTATION: The DeepCellEss web server is available at http://csuligroup.com:8000/DeepCellEss. The source code and data underlying this study can be obtained from https://github.com/CSUBioGroup/DeepCellEss. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Deep Learning; Proteins/metabolism; Amino Acid Sequence; Software; Cell Line; Computational Biology/methods

14.

NTD-DR: Nonnegative tensor decomposition for drug repositioning.

Jamali, Ali Akbar; Tan, Yuting; Kusalik, Anthony; Wu, Fang-Xiang.

PLoS One ; 17(7): e0270852, 2022.

Article in English | MEDLINE | ID: mdl-35862409

ABSTRACT

Computational drug repositioning aims to identify potential applications of existing drugs for the treatment of diseases for which they were not designed. This approach can considerably accelerate the traditional drug discovery process by decreasing the required time and costs of drug development. Tensor decomposition enables us to integrate multiple drug- and disease-related data to boost the performance of prediction. In this study, a nonnegative tensor decomposition for drug repositioning, NTD-DR, is proposed. In order to capture the hidden information in drug-target, drug-disease, and target-disease networks, NTD-DR uses these pairwise associations to construct a three-dimensional tensor representing drug-target-disease triplet associations and integrates them with similarity information of drugs, targets, and disease to make a prediction. We compare NTD-DR with recent state-of-the-art methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR) and find that our method outperforms competing methods. Moreover, case studies with five diseases also confirm the reliability of predictions made by NTD-DR. Our proposed method identifies more known associations among the top 50 predictions than other methods. In addition, novel associations identified by NTD-DR are validated by literature analyses.
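
The AUC used for comparison in this abstract can be computed directly from its ranking interpretation: the probability that a randomly chosen known association receives a higher score than a randomly chosen non-association. The sketch below is a minimal illustration with a hypothetical function name; it is not the evaluation code of the study, and AUPR is not covered.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparisons; ties count one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop costs time proportional to the number of positive-negative pairs; for large prediction matrices a sort-based rank formulation would be the usual choice.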

Subjects

Computational Biology; Drug Repositioning; Algorithms; Computational Biology/methods; Drug Discovery/methods; Drug Repositioning/methods; ROC Curve; Reproducibility of Results

15.

Drug Repositioning with GraphSAGE and Clustering Constraints Based on Drug and Disease Networks.

Zhang, Yuchen; Lei, Xiujuan; Pan, Yi; Wu, Fang-Xiang.

Front Pharmacol ; 13: 872785, 2022.

Article in English | MEDLINE | ID: mdl-35620297

ABSTRACT

Understanding therapeutic properties is important in drug repositioning and drug discovery. However, chemical or clinical trials are an expensive and inefficient way to characterize the therapeutic properties of drugs. Recently, artificial intelligence (AI)-assisted algorithms have received extensive attention for discovering the potential therapeutic properties of drugs and speeding up drug development. In this study, we propose a new method based on GraphSAGE and clustering constraints (DRGCC) to investigate the potential therapeutic properties of drugs for drug repositioning. First, the drug structure features and disease symptom features are extracted. Second, the drug-drug interaction network and disease similarity network are constructed according to the drug-gene and disease-gene relationships. Matrix factorization is adopted to extract the clustering features of the networks. Then, all the features are fed to GraphSAGE to predict new associations between existing drugs and diseases. Benchmark comparisons on two different datasets show that our method has reliable predictive performance and outperforms six other competing methods. We have also conducted case studies on existing drugs and diseases and aimed to predict drugs that may be effective for the novel coronavirus disease 2019 (COVID-19). Among the predicted anti-COVID-19 drug candidates, some drugs are being clinically studied by pharmacologists, and their binding sites to COVID-19-related protein receptors have been found via molecular docking.

16.

Chinese clinical named entity recognition via multi-head self-attention based BiLSTM-CRF.

An, Ying; Xia, Xianyun; Chen, Xianlai; Wu, Fang-Xiang; Wang, Jianxin.

Artif Intell Med ; 127: 102282, 2022 May.

Article in English | MEDLINE | ID: mdl-35430042

ABSTRACT

Clinical named entity recognition (CNER) is a fundamental step for many clinical Natural Language Processing (NLP) systems. It aims to recognize and classify clinical entities such as diseases, symptoms, exams, body parts and treatments in clinical free texts. In recent years, with the development of deep learning technology, deep neural networks (DNNs) have been widely used in Chinese clinical named entity recognition and many other clinical NLP tasks. However, these state-of-the-art models fail to make full use of the global information and multi-level semantic features in clinical texts. We design an improved character-level representation approach that integrates the character embedding and the character-label embedding to enhance the specificity and diversity of feature representations. Then, a multi-head self-attention based Bi-directional Long Short-Term Memory Conditional Random Field (MUSA-BiLSTM-CRF) model is proposed. By introducing multi-head self-attention and incorporating a medical dictionary, the model can more effectively capture the weight relationships between characters and multi-level semantic feature information, which is expected to greatly improve the performance of Chinese clinical named entity recognition. We evaluate our model on two CCKS challenge benchmark datasets (CCKS2017 Task 2 and CCKS2018 Task 1), and the experimental results show that our proposed model achieves the best performance among the compared state-of-the-art DNN-based methods.

Subjects

Electronic Health Records; Natural Language Processing; China; Language; Neural Networks, Computer

17.

HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure.

Xiang, Ju; Meng, Xiangmao; Zhao, Yichao; Wu, Fang-Xiang; Li, Min.

Brief Bioinform ; 23(3), 2022 May 13.

Article in English | MEDLINE | ID: mdl-35275996

ABSTRACT

MOTIVATION: Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, mining and effectively utilizing the module structure remains challenging in problems such as disease-gene prediction. RESULTS: We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for the integration of gene rankings is designed to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. Through a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods, as well as the further performance improvement derived from the parameter estimation. CONCLUSIONS: The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in problems such as disease-gene prediction.

Subjects

Algorithms, Computational Biology, Computational Biology/methods, Statistical Models, Proteins

18.

MLRDFM: a multi-view Laplacian regularized DeepFM model for predicting miRNA-disease associations.

Ding, Yulian; Lei, Xiujuan; Liao, Bo; Wu, Fang-Xiang.

Brief Bioinform ; 23(3)2022 05 13.

Article in English | MEDLINE | ID: mdl-35323901

ABSTRACT

MOTIVATION: MicroRNAs (miRNAs) are critical regulators involved in many fundamental biological processes, and their abnormalities are closely related to human diseases. Predicting disease-related miRNAs helps uncover new biomarkers for the prevention, detection, prognosis, diagnosis and treatment of complex diseases. RESULTS: In this study, we propose a multi-view Laplacian regularized deep factorization machine (DeepFM) model, MLRDFM, to predict novel miRNA-disease associations while improving the standard DeepFM. Specifically, MLRDFM improves DeepFM in two respects. First, MLRDFM takes the relationships among items into consideration by regularizing their embedding features via their similarity-based Laplacians. In this study, miRNA Laplacian regularization integrates four types of miRNA similarity, while disease Laplacian regularization integrates two types of disease similarity. Second, to train the model judiciously, Laplacian eigenmaps are used to initialize the weights in the dense embedding layer. Experimental results on the latest HMDD v3.2 dataset show that MLRDFM improves the performance of DeepFM and reduces its overfitting. In addition, MLRDFM substantially outperforms state-of-the-art models in miRNA-disease association prediction across different evaluation metrics under 5-fold cross-validation. Case studies further demonstrate the effectiveness of MLRDFM.
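The Laplacian regularization mentioned here can be illustrated with the standard graph-Laplacian penalty tr(E^T L E), where L = D - S is built from a similarity matrix S and E holds one embedding per row. The sketch below is a generic, pure-Python version of that textbook penalty; how MLRDFM combines its several similarity views and weights the term in its loss is not stated in the abstract, so none of that is modeled, and the function names are hypothetical.

```python
def laplacian(S):
    """Unnormalized graph Laplacian L = D - S for a symmetric similarity matrix S."""
    n = len(S)
    return [
        [(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(n)]
        for i in range(n)
    ]


def laplacian_penalty(E, S):
    """Compute tr(E^T L E) = sum_ij L[i][j] * <E[i], E[j]>.

    For symmetric S this equals 0.5 * sum_ij S[i][j] * ||E[i] - E[j]||^2,
    so similar items with distant embeddings are penalized.
    """
    L = laplacian(S)
    n, d = len(E), len(E[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(E[i][k] * E[j][k] for k in range(d))
            total += L[i][j] * dot
    return total


if __name__ == "__main__":
    S = [[0.0, 1.0], [1.0, 0.0]]      # two items with similarity 1
    E = [[0.0, 0.0], [3.0, 4.0]]      # embeddings at distance 5
    print(laplacian_penalty(E, S))    # 0.5 * (25 + 25) = 25.0
```

Adding a term of this form to a training loss pulls the embeddings of similar items together, which is the general effect the abstract attributes to similarity-based Laplacian regularization.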

Subjects

MicroRNAs, Algorithms, Computational Biology/methods, Genetic Predisposition to Disease, Humans, MicroRNAs/genetics

19.

Biomedical data, computational methods and tools for evaluating disease-disease associations.

Xiang, Ju; Zhang, Jiashuai; Zhao, Yichao; Wu, Fang-Xiang; Li, Min.

Brief Bioinform ; 23(2)2022 03 10.

Article in English | MEDLINE | ID: mdl-35136949

ABSTRACT

In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, many computational methods and tools/platforms have been developed to reveal intrinsic relationships between diseases, which can provide useful insights into the study of complex diseases, e.g. understanding the molecular mechanisms of diseases and discovering new treatments. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms. Computational methods that use different types of biomedical data, from phenotype to genotype, can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups according to the biomedical data they use: disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for the computation and analysis of disease-disease associations. Finally, we discuss and summarize research on disease-disease associations. This review provides a systematic overview of current disease association research, which could promote the development and application of computational methods and tools/platforms for disease-disease associations.

Subjects

Computational Biology, Data Mining, Computational Biology/methods, Data Mining/methods, Factual Databases, Phenotype, Software

20.

PDMDA: predicting deep-level miRNA-disease associations with graph neural networks and sequence features.

Yan, Cheng; Duan, Guihua; Li, Na; Zhang, Lishen; Wu, Fang-Xiang; Wang, Jianxin.

Bioinformatics ; 38(8): 2226-2234, 2022 04 12.

Article in English | MEDLINE | ID: mdl-35150255

ABSTRACT

MOTIVATION: Many studies have shown that microRNAs (miRNAs) play a key role in human diseases. Meanwhile, traditional experimental methods for identifying miRNA-disease associations are extremely costly, time-consuming and challenging. Therefore, many computational methods have been developed to predict potential associations between miRNAs and diseases. However, those methods mainly predict whether a miRNA-disease association exists, and they cannot predict deep-level miRNA-disease association types. RESULTS: In this study, we propose a new end-to-end deep learning method (called PDMDA) to predict deep-level miRNA-disease associations with graph neural networks (GNNs) and miRNA sequence features. Based on the sequence and structural features of miRNAs, PDMDA extracts miRNA feature representations with a fully connected network (FCN). Disease feature representations are extracted from the disease-gene network and the gene-gene interaction network by a GNN model. Finally, a multilayer network with three fully connected layers and a softmax layer is designed to predict the final miRNA-disease association scores from the concatenated feature representations of miRNAs and diseases. Note that PDMDA does not take the miRNA-disease association matrix as input to compute the Gaussian interaction profile similarity. We conduct three experiments based on six association type samples (circulations, epigenetics, target, genetics, known associations whose types are unknown, and unknown association samples). We use fivefold cross-validation to assess the prediction performance of PDMDA, with the area under the receiver operating characteristic curve as the metric. The experimental results show that PDMDA can accurately predict deep-level miRNA-disease associations. AVAILABILITY AND IMPLEMENTATION: Data and source code are available at https://github.com/27167199/PDMDA.
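The final prediction step described here (concatenate the miRNA and disease representations, then turn the network's outputs into per-type scores with softmax) can be sketched generically as follows. This is only an illustration of concatenation and a numerically stable softmax; the helper names are hypothetical, the fully connected layers are omitted, and nothing here is taken from the PDMDA source code.

```python
import math


def concat_features(mirna_vec, disease_vec):
    """Concatenate a miRNA representation and a disease representation."""
    return list(mirna_vec) + list(disease_vec)


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    pair = concat_features([0.2, 0.7], [0.1, 0.9, 0.4])
    print(len(pair))                   # length of the joint representation
    # Placeholder logits standing in for the output of the last dense layer,
    # one per association type; the values are made up for illustration.
    print(softmax([2.0, 0.5, 0.1, -1.0, 0.0, 0.3]))
```

The softmax output sums to one, so each entry can be read as the predicted score for one association type.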

Subjects

MicroRNAs, Humans, MicroRNAs/genetics, Algorithms, Computational Biology/methods, Computer Neural Networks, Software
