Results 1 - 20 of 3,204
1.
Development ; 150(22)2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37830145

ABSTRACT

Recent work shows that the developmental potential of progenitor cells in the HH10 chick brain changes rapidly, accompanied by subtle changes in morphology. This demands increased temporal resolution for studies of the brain at this stage, necessitating precise and unbiased staging. Here, we investigated whether we could train a deep convolutional neural network to sub-stage HH10 chick brains using a small dataset of 151 expertly labelled images. By augmenting our images with biologically informed transformations and data-driven preprocessing steps, we successfully trained a classifier to sub-stage HH10 brains to 87.1% test accuracy. To determine whether our classifier could be generally applied, we re-trained it using images (269) of randomised control and experimental chick wings, and obtained similarly high test accuracy (86.1%). Saliency analyses revealed that biologically relevant features are used for classification. Our strategy enables training of image classifiers for various applications in developmental biology with limited microscopy data.


Subjects
Deep Learning , Animals , Neural Networks, Computer , Brain , Microscopy , Wings, Animal
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38704671

ABSTRACT

Computational analysis of fluorescent timelapse microscopy images at the single-cell level is a powerful approach to study cellular changes that dictate important cell fate decisions. Core to this approach is the need to generate reliable cell segmentations and classifications necessary for accurate quantitative analysis. Deep learning-based convolutional neural networks (CNNs) have emerged as a promising solution to these challenges. However, current CNNs are prone to produce noisy cell segmentations and classifications, which is a significant barrier to constructing accurate single-cell lineages. To address this, we developed a novel algorithm called Single Cell Track (SC-Track), which employs a hierarchical probabilistic cache cascade model based on biological observations of cell division and movement dynamics. Our results show that SC-Track performs better than a panel of publicly available cell trackers on a diverse set of cell segmentation types. This cell-tracking performance was achieved without any parameter adjustments, making SC-Track an excellent generalized algorithm that can maintain robust cell-tracking performance in varying cell segmentation qualities, cell morphological appearances and imaging conditions. Furthermore, SC-Track is equipped with a cell class correction function to improve the accuracy of cell classifications in multiclass cell segmentation time series. These features together make SC-Track a robust cell-tracking algorithm that works well with noisy cell instance segmentation and classification predictions from CNNs to generate accurate single-cell lineages and classifications.


Subjects
Algorithms , Cell Lineage , Cell Tracking , Single-Cell Analysis , Cell Tracking/methods , Single-Cell Analysis/methods , Humans , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Deep Learning , Microscopy, Fluorescence/methods
3.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711371

ABSTRACT

T-cell receptor (TCR) recognition of antigens is fundamental to the adaptive immune response. With the expansion of experimental techniques, a substantial database of matched TCR-antigen pairs has emerged, presenting opportunities for computational prediction models. However, accurately forecasting the binding affinities of unseen antigen-TCR pairs remains a major challenge. Here, we present convolutional-self-attention TCR (CATCR), a novel framework tailored to enhance the prediction of epitope and TCR interactions. Our approach utilizes convolutional neural networks to extract peptide features from residue contact matrices, as generated by OpenFold, and a transformer to encode segment-based coded sequences. We introduce CATCR-D, a discriminator that can assess binding by analyzing the structural and sequence features of epitopes and CDR3-β regions. Additionally, the framework comprises CATCR-G, a generative module designed for CDR3-β sequences, which applies the pretrained encoder to deduce epitope characteristics and a transformer decoder for predicting matching CDR3-β sequences. CATCR-D achieved an AUROC of 0.89 on previously unseen epitope-TCR pairs and outperformed four benchmark models by a margin of 17.4%. CATCR-G has demonstrated high precision, recall and F1 scores, surpassing 95% in bidirectional encoder representations from transformers score assessments. Our results indicate that CATCR is an effective tool for predicting unseen epitope-TCR interactions. Incorporating structural insights enhances our understanding of the general rules governing TCR-epitope recognition significantly. The ability to predict TCRs for novel epitopes using structural and sequence information is promising, and broadening the repository of experimental TCR-epitope data could further improve the precision of epitope-TCR binding predictions.


Subjects
Receptors, Antigen, T-Cell , Receptors, Antigen, T-Cell/chemistry , Receptors, Antigen, T-Cell/immunology , Receptors, Antigen, T-Cell/metabolism , Receptors, Antigen, T-Cell/genetics , Humans , Epitopes/chemistry , Epitopes/immunology , Computational Biology/methods , Neural Networks, Computer , Epitopes, T-Lymphocyte/immunology , Epitopes, T-Lymphocyte/chemistry , Antigens/chemistry , Antigens/immunology , Amino Acid Sequence
4.
Genet Epidemiol ; 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39344923

ABSTRACT

Transcriptome-wide association studies (TWAS) aim to uncover genotype-phenotype relationships through a two-stage procedure: predicting gene expression from genotypes using an expression quantitative trait locus (eQTL) data set, then testing the predicted expression for trait associations. Accurate gene expression prediction in stage 1 is crucial, as it directly impacts the power to identify associations in stage 2. Currently, the first stage of such studies is primarily conducted using linear models like elastic net regression, which fail to capture the nonlinear relationships inherent in biological systems. Deep learning methods have the potential to model such nonlinear effects, but have yet to demonstrably outperform linear methods at this task. To address this gap, we propose a new deep learning architecture to predict gene expression from genotypic variation across individuals. Our method utilizes a learnable input scaling layer in conjunction with a convolutional encoder to capture nonlinear effects and higher-order interactions without compromising on interpretability. We further augment this approach to allow for parameter sharing across multiple networks, enabling us to utilize prior information for individual variants in the form of functional annotations. Evaluations on real-world genomic data show that our method consistently outperforms elastic net regression across a large set of heritable genes. Furthermore, our model statistically significantly improved predictive performance by leveraging functional annotations, whereas elastic net regression failed to show equivalent gains when using the same information, suggesting that our method can capture nonlinear functional information beyond the capability of linear models.

5.
Plant J ; 120(1): 174-186, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39133828

ABSTRACT

Deep learning offers new approaches to investigate the mechanisms underlying complex biological phenomena, such as subgenome dominance. Subgenome dominance refers to the dominant expression and/or biased fractionation of genes in one subgenome of allopolyploids, which has shaped the evolution of a large group of plants. However, the underlying cause of subgenome dominance remains elusive. Here, we adopt deep learning to construct two convolutional neural network (CNN) models, binary expression model (BEM) and homoeolog contrast model (HCM), to investigate the mechanism underlying subgenome dominance using DNA sequence and methylation sites. We apply these CNN models to analyze three representative polyploidization systems, Brassica, Gossypium, and Cucurbitaceae, each with available ancient and neo/synthetic polyploidized genomes. The BEM shows that DNA sequence of the promoter region can accurately predict whether a gene is expressed or not. More importantly, the HCM shows that the DNA sequence of the promoter region predicts dominant expression status between homoeologous gene pairs retained from ancient polyploidizations, thus predicting subgenome dominance associated with these events. However, HCM fails to predict gene expression dominance between new homoeologous gene pairs arising from the neo/synthetic polyploidizations. These results are consistent across the three plant polyploidization systems, indicating broad applicability of our models. Furthermore, the two models based on methylation sites produce similar results. These results show that subgenome dominance is associated with long-term sequence differentiation between the promoters of homoeologs, suggesting that subgenome expression dominance precedes and is the driving force or even the determining factor for sequence divergence between subgenomes following polyploidization.


Subjects
Deep Learning , Genome, Plant , Polyploidy , Genome, Plant/genetics , DNA Methylation , Promoter Regions, Genetic/genetics , Evolution, Molecular , Neural Networks, Computer , Gene Expression Regulation, Plant
6.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36929841

ABSTRACT

Single-cell omics data are growing at an unprecedented rate, but integrating them effectively remains challenging because sequencing methods, data quality and expression patterns differ across omics layers. In this study, we propose a universal framework for the integration of single-cell multi-omics data based on a graph convolutional network (GCN-SC). Among the multiple single-cell datasets, GCN-SC usually selects the dataset with the largest number of cells as the reference and treats the rest as query datasets. It uses a mutual nearest neighbor algorithm to identify cell pairs, which provide connections between cells both within and across the reference and query datasets. A GCN algorithm then takes the mixed graph constructed from these cell pairs to adjust the count matrices of the query datasets. Finally, dimension reduction is performed using non-negative matrix factorization before visualization. By applying GCN-SC to six datasets, we show that GCN-SC can effectively integrate data from multiple single-cell sequencing technologies, species or omics types, and that it outperforms state-of-the-art methods including Seurat, LIGER, GLUER and Pamona.


Subjects
Algorithms , Multiomics , Cluster Analysis
7.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37328692

ABSTRACT

Protein complexes are key functional units in cellular processes. High-throughput techniques, such as co-fractionation coupled with mass spectrometry (CF-MS), have advanced protein complex studies by enabling global interactome inference. However, dealing with complex fractionation characteristics to define true interactions is not a simple task, since CF-MS is prone to false positives due to the co-elution of non-interacting proteins by chance. Several computational methods have been designed to analyze CF-MS data and construct probabilistic protein-protein interaction (PPI) networks. Current methods usually first infer PPIs based on handcrafted CF-MS features, and then use clustering algorithms to form potential protein complexes. While powerful, these methods suffer from the potential bias of handcrafted features, which are based on domain knowledge, and they tend to overfit because the PPI data are severely imbalanced. To address these issues, we present a balanced end-to-end learning architecture, Software for Prediction of Interactome with Feature-extraction Free Elution Data (SPIFFED), to integrate feature representation from raw CF-MS data and interactome prediction by convolutional neural network. SPIFFED outperforms the state-of-the-art methods in predicting PPIs under the conventional imbalanced training. When trained with balanced data, SPIFFED had greatly improved sensitivity for true PPIs. Moreover, the ensemble SPIFFED model provides different voting schemes to integrate predicted PPIs from multiple CF-MS data. Using the clustering software (i.e. ClusterONE), SPIFFED allows users to infer high-confidence protein complexes depending on the CF-MS experimental designs. The source code of SPIFFED is freely available at: https://github.com/bio-it-station/SPIFFED.


Subjects
Protein Interaction Mapping , Proteins , Protein Interaction Mapping/methods , Proteins/chemistry , Algorithms , Protein Interaction Maps , Software
8.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37328639

ABSTRACT

Precise targeting of transcription factor binding sites (TFBSs) is essential to comprehending transcriptional regulatory processes and investigating cellular function. Although several deep learning algorithms have been created to predict TFBSs, the models' intrinsic mechanisms and prediction results are difficult to explain. There is still room for improvement in prediction performance. We present DeepSTF, a unique deep-learning architecture for predicting TFBSs by integrating DNA sequence and shape profiles. We use the improved transformer encoder structure for the first time in the TFBSs prediction approach. DeepSTF extracts DNA higher-order sequence features using stacked convolutional neural networks (CNNs), whereas rich DNA shape profiles are extracted by combining improved transformer encoder structure and bidirectional long short-term memory (Bi-LSTM), and, finally, the derived higher-order sequence features and representative shape profiles are integrated into the channel dimension to achieve accurate TFBSs prediction. Experiments on 165 ENCODE chromatin immunoprecipitation sequencing (ChIP-seq) datasets show that DeepSTF considerably outperforms several state-of-the-art algorithms in predicting TFBSs, and we explain the usefulness of the transformer encoder structure and the combined strategy using sequence features and shape profiles in capturing multiple dependencies and learning essential features. In addition, this paper examines the significance of DNA shape features predicting TFBSs. The source code of DeepSTF is available at https://github.com/YuBinLab-QUST/DeepSTF/.


Subjects
DNA , Neural Networks, Computer , Binding Sites , Protein Binding , DNA/genetics , DNA/chemistry , Transcription Factors/genetics , Transcription Factors/chemistry
9.
Nano Lett ; 24(19): 5862-5869, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38709809

ABSTRACT

Dynamic vision perception and processing (DVPP) is in high demand by booming edge artificial intelligence. However, existing imaging systems suffer from low efficiency or low compatibility with advanced machine vision techniques. Here, we propose a reconfigurable bipolar image sensor (RBIS) for in-sensor DVPP based on a two-dimensional WSe2/GeSe heterostructure device. Owing to the gate-tunable and reversible built-in electric field, its photoresponse is bipolar, that is, it can be positive or negative. High-efficiency DVPP incorporating a front-end RBIS and a back-end CNN is then demonstrated. It shows a high recognition accuracy of over 94.9% on the derived DVS128 data set and requires far fewer neural network parameters than the same pipeline without RBIS. Moreover, we demonstrate an optimized device with a vertically stacked structure and a stable nonvolatile bipolarity, which enables more efficient DVPP hardware. Our work demonstrates the potential of fabricating DVPP devices with a simple structure, high efficiency, and outputs compatible with advanced algorithms.

10.
BMC Bioinformatics ; 25(1): 106, 2024 Mar 10.
Article in English | MEDLINE | ID: mdl-38461247

ABSTRACT

BACKGROUND: Predicting protein-protein interactions (PPIs) from sequence data is a key challenge in computational biology. While various computational methods have been proposed, the utilization of sequence embeddings from protein language models, which contain diverse information, including structural, evolutionary, and functional aspects, has not been fully exploited. Additionally, there is a significant need for a comprehensive neural network capable of efficiently extracting these multifaceted representations. RESULTS: Addressing this gap, we propose xCAPT5, a novel hybrid classifier that uniquely leverages the T5-XL-UniRef50 protein large language model for generating rich amino acid embeddings from protein sequences. The core of xCAPT5 is a multi-kernel deep convolutional siamese neural network, which effectively captures intricate interaction features at both micro and macro levels, integrated with the XGBoost algorithm, enhancing PPIs classification performance. By concatenating max and average pooling features in a depth-wise manner, xCAPT5 effectively learns crucial features with low computational cost. CONCLUSION: This study represents one of the initial efforts to extract informative amino acid embeddings from a large protein language model using a deep and wide convolutional network. Experimental results show that xCAPT5 outperforms recent state-of-the-art methods in binary PPI prediction, excelling in cross-validation on several benchmark datasets and demonstrating robust generalization across intra-species, cross-species, inter-species, and stringent similarity contexts.
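
The idea of concatenating max and average pooling features along the depth (channel) axis can be sketched in a few lines. This is only an illustration under assumptions: the feature map is represented as a plain list of channels, pooling is global over sequence positions, and the function names are hypothetical rather than taken from xCAPT5.

```python
def global_max_pool(feature_map):
    """Largest activation in each channel."""
    return [max(channel) for channel in feature_map]

def global_avg_pool(feature_map):
    """Mean activation in each channel."""
    return [sum(channel) / len(channel) for channel in feature_map]

def concat_pooled(feature_map):
    """Concatenate max-pooled and average-pooled features along the channel axis."""
    return global_max_pool(feature_map) + global_avg_pool(feature_map)

# Two channels, three sequence positions each (hypothetical activations).
feature_map = [[1.0, 3.0, 2.0],
               [0.0, -1.0, 4.0]]
pooled = concat_pooled(feature_map)
print(pooled)  # [3.0, 4.0, 2.0, 1.0]
```

The output has twice as many entries as there are input channels, regardless of sequence length, which is one reason this kind of pooling keeps the downstream classifier small.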


Subjects
Neural Networks, Computer , Proteins , Proteins/chemistry , Algorithms , Amino Acid Sequence , Amino Acids
11.
BMC Bioinformatics ; 25(1): 10, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38177981

ABSTRACT

Examining potential drug-target interactions (DTIs) is a pivotal component of drug discovery and repurposing. Recently, there has been a significant rise in the use of computational techniques to predict DTIs. Nevertheless, previous investigations have predominantly concentrated on assessing either the connections between nodes or the consistency of the network's topological structure in isolation. Such one-sided approaches could severely hinder the accuracy of DTI predictions. In this study, we propose a novel method called TTGCN, which combines heterogeneous graph convolutional neural networks (GCN) and graph attention networks (GAT) to address the task of DTI prediction. TTGCN employs a two-tiered feature learning strategy, utilizing GAT and residual GCN (R-GCN) to extract drug and target embeddings from the diverse network, respectively. These drug and target embeddings are then fused through a mean-pooling layer. Finally, we employ an inductive matrix completion technique to forecast DTIs while preserving the network's node connectivity and topological structure. Our approach demonstrates superior performance in terms of area under the curve and area under the precision-recall curve in experimental comparisons, highlighting its significant advantages in predicting DTIs. Furthermore, case studies provide additional evidence of its ability to identify potential DTIs.


Subjects
Drug Discovery , Learning , Drug Interactions , Neural Networks, Computer
12.
BMC Bioinformatics ; 25(1): 159, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643080

ABSTRACT

BACKGROUND: MicroRNAs play a critical role in regulating gene expression by binding to specific target sites within gene transcripts, making the identification of microRNA targets a prominent focus of research. Conventional experimental methods for identifying microRNA targets are both time-consuming and expensive, prompting the development of computational tools for target prediction. However, the existing computational tools exhibit limited performance in meeting the demands of practical applications, highlighting the need to improve the performance of microRNA target prediction models. RESULTS: In this paper, we utilize the most popular natural language processing and computer vision technologies to propose a novel approach, called TEC-miTarget, for microRNA target prediction based on transformer encoder and convolutional neural networks. TEC-miTarget treats RNA sequences as a natural language and encodes them using a transformer encoder, a widely used encoder in natural language processing. It then combines the representations of a pair of microRNA and its candidate target site sequences into a contact map, which is a three-dimensional array similar to a multi-channel image. Therefore, the contact map's features are extracted using a four-layer convolutional neural network, enabling the prediction of interactions between microRNA and its candidate target sites. We applied a series of comparative experiments to demonstrate that TEC-miTarget significantly improves microRNA target prediction, compared with existing state-of-the-art models. Our approach is the first approach to perform comparisons with other approaches at both sequence and transcript levels. Furthermore, it is the first approach compared with both deep learning-based and seed-match-based methods. We first compared TEC-miTarget's performance with approaches at the sequence level, and our approach delivers substantial improvements in performance using the same datasets and evaluation metrics. 
Moreover, we utilized TEC-miTarget to predict microRNA targets in long mRNA sequences, which involves two steps: selecting candidate target site sequences and applying sequence-level predictions. We finally showed that TEC-miTarget outperforms other approaches at the transcript level, including the popular seed match methods widely used in previous years. CONCLUSIONS: We propose a novel approach for predicting microRNA targets at both sequence and transcript levels, and demonstrate that our approach outperforms other methods based on deep learning or seed match. We also provide our approach as easy-to-use software, TEC-miTarget, at https://github.com/tingpeng17/TEC-miTarget. Our results provide new perspectives for microRNA target prediction.


Subjects
Deep Learning , MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Neural Networks, Computer , Software , RNA, Messenger/genetics
13.
BMC Bioinformatics ; 25(1): 306, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39304807

ABSTRACT

BACKGROUND: Locating small molecule binding sites in target proteins, at the resolution of either pocket or residue, is critical in many drug-discovery scenarios. Since it is not always easy to find such binding sites using conventional methods, different deep learning methods to predict binding sites from protein structures have been developed in recent years. The existing deep learning based methods have several limitations, including (1) the inefficiency of the CNN-only architecture, (2) loss of information due to excessive post-processing, and (3) the under-utilization of available data sources. METHODS: We present a new model architecture and training method that resolves the aforementioned problems. First, by layering geometric self-attention units on top of residue-level 3D CNN outputs, our model overcomes the problems of CNN-only architectures. Second, by configuring the fundamental units of computation as residues and pockets instead of voxels, our method reduces the information loss from post-processing. Lastly, by employing inter-resolution transfer learning and homology-based augmentation, our method maximizes the utilization of available data sources. RESULTS: The proposed method significantly outperformed all state-of-the-art baselines at both resolutions (pocket and residue). An ablation study demonstrated the indispensability of our proposed architecture, as well as transfer learning and homology-based augmentation, for achieving optimal performance. We further scrutinized our model's performance through a case study involving human serum albumin, which demonstrated our model's superior capability in identifying multiple binding sites of the protein, outperforming the existing methods. CONCLUSIONS: We believe that our contribution to the literature is twofold. Firstly, we introduce a novel computational method for binding site prediction with practical applications, substantiated by its strong performance across diverse benchmarks and case studies. Secondly, the innovative aspects of our method (specifically, the design of the model architecture, inter-resolution transfer learning, and homology-based augmentation) would serve as useful components for future work.


Subjects
Proteins , Binding Sites , Proteins/chemistry , Proteins/metabolism , Deep Learning , Computational Biology/methods , Protein Binding , Humans , Databases, Protein
14.
BMC Genomics ; 25(1): 47, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38200437

ABSTRACT

BACKGROUND: Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes. RESULTS: In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on graph convolutional neural networks (GCNN). GCNN-SFM integrates a graph convolutional layer, a convolutional layer, and a fully connected layer to model and extract features from gene sequences of essential genes. Initially, the gene sequence is transformed into a feature map using coding techniques. Subsequently, a multi-layer GCN is employed to perform graph convolution operations, effectively capturing both local and global features of the gene sequence. Further feature extraction is performed, followed by integrating convolution and fully-connected layers to generate prediction results for essential genes. The gradient descent algorithm is utilized to iteratively update the cross-entropy loss function, thereby enhancing the accuracy of the prediction results. Meanwhile, model parameters are tuned to determine the optimal parameter combination that yields the best prediction performance during training. CONCLUSIONS: Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.


Subjects
Genes, Essential , Neural Networks, Computer , Algorithms , Entropy , Genomics
15.
Neuroimage ; 294: 120646, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38750907

ABSTRACT

Deep learning can be used effectively to predict participants' age from brain magnetic resonance imaging (MRI) data, and a growing body of evidence suggests that the difference between predicted and chronological age, referred to as brain-predicted age difference (brain-PAD), is related to various neurological and neuropsychiatric disease states. A crucial aspect of the applicability of brain-PAD as a biomarker of individual brain health is whether and how brain-predicted age is affected by MR image artifacts commonly encountered in clinical settings. To investigate this issue, we trained and validated two different 3D convolutional neural network architectures (CNNs) from scratch and tested the models on a separate dataset consisting of motion-free and motion-corrupted T1-weighted MRI scans from the same participants, the quality of which was rated by neuroradiologists from a clinical diagnostic point of view. Our results revealed a systematic increase in brain-PAD with worsening image quality for both models. This effect was also observed for images that were deemed usable from a clinical perspective, with brains appearing older in medium than in good quality images. These findings were also supported by significant associations found between the brain-PAD and standard image quality metrics indicating larger brain-PAD for lower-quality images. Our results demonstrate a spurious effect of advanced brain aging as a result of head motion and underline the importance of controlling for image quality when using brain-predicted age based on structural neuroimaging data as a proxy measure for brain health.
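
As a small worked illustration of the quantity discussed in this abstract, brain-PAD is the predicted age minus the chronological age, and it can be averaged per image-quality rating. The sketch below assumes a hypothetical record layout, and the ages are invented toy values, not data from the study.

```python
from collections import defaultdict

def brain_pad(predicted_age, chronological_age):
    """Brain-predicted age difference: positive means the brain 'looks older'."""
    return predicted_age - chronological_age

def mean_pad_by_quality(scans):
    """Average brain-PAD for each image-quality rating."""
    grouped = defaultdict(list)
    for scan in scans:
        grouped[scan["quality"]].append(
            brain_pad(scan["predicted"], scan["chronological"]))
    return {quality: sum(v) / len(v) for quality, v in grouped.items()}

# Invented toy records (not study data): two participants, two scans each.
scans = [
    {"quality": "good",   "predicted": 41.0, "chronological": 40.0},
    {"quality": "good",   "predicted": 30.0, "chronological": 31.0},
    {"quality": "medium", "predicted": 45.0, "chronological": 40.0},
    {"quality": "medium", "predicted": 34.0, "chronological": 31.0},
]
summary = mean_pad_by_quality(scans)
print(summary)  # {'good': 0.0, 'medium': 4.0}
```

Grouping by quality rating in this way is the simplest form of the control the abstract argues for: a difference between groups scanned from the same participants points to image quality rather than biology.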


Subjects
Brain , Deep Learning , Magnetic Resonance Imaging , Neural Networks, Computer , Humans , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Adult , Male , Female , Middle Aged , Young Adult , Aging/physiology , Aged , Head Movements/physiology , Artifacts , Image Processing, Computer-Assisted/methods , Adolescent
16.
J Cell Sci ; 135(3)2022 02 01.
Article in English | MEDLINE | ID: mdl-35022745

ABSTRACT

Immunofluorescence microscopy is routinely used to visualise the spatial distribution of proteins that dictates their cellular function. However, unspecific antibody binding often results in high cytosolic background signals, decreasing the image contrast of a target structure. Recently, convolutional neural networks (CNNs) were successfully employed for image restoration in immunofluorescence microscopy, but current methods cannot correct for those background signals. We report a new method that trains a CNN to reduce unspecific signals in immunofluorescence images; we name this method label2label (L2L). In L2L, a CNN is trained with image pairs of two non-identical labels that target the same cellular structure. We show that after L2L training a network predicts images with significantly increased contrast of a target structure, which is further improved after implementing a multiscale structural similarity loss function. Here, our results suggest that sample differences in the training data decrease hallucination effects that are observed with other methods. We further assess the performance of a cycle generative adversarial network, and show that a CNN can be trained to separate structures in superposed immunofluorescence images of two targets.


Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Cellular Structures , Image Processing, Computer-Assisted/methods , Microscopy, Fluorescence
17.
BMC Plant Biol ; 24(1): 136, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38408925

ABSTRACT

Subsistence farmers and global food security depend on sufficient food production, which aligns with the UN's "Zero Hunger," "Climate Action," and "Responsible Consumption and Production" sustainable development goals. Existing methods for early disease detection and classification face overfitting and difficulties with fine feature extraction during training, and how early signs of green attacks can be identified or classified remains uncertain. Most pest and disease symptoms are seen in plant leaves and fruits, yet their diagnosis by experts in the laboratory is expensive, tedious, labor-intensive, and time-consuming. Notably, how plant pests and diseases can be detected appropriately and prevented in time, a hotspot in smart, sustainable agriculture, remains unknown. In recent years, deep transfer learning has demonstrated tremendous advances in the recognition accuracy of object detection and image classification systems since these frameworks utilize previously acquired knowledge to solve similar problems more effectively and quickly. Therefore, in this research, we introduce two plant disease detection (PDDNet) models of early fusion (AE) and the lead voting ensemble (LVE) integrated with nine pre-trained convolutional neural networks (CNNs) and fine-tuned by deep feature extraction for efficient plant disease identification and classification. The experiments were carried out on 15 classes of the popular PlantVillage dataset, which has 54,305 image samples of different plant disease species in 38 categories. Hyperparameter fine-tuning was done with popular pre-trained models, including DenseNet201, ResNet101, ResNet50, GoogleNet, AlexNet, ResNet18, EfficientNetB7, NASNetMobile, and ConvNeXtSmall. We test these CNNs on the stated plant disease detection and classification problem, both independently and as part of an ensemble.
In the final phase, a logistic regression (LR) classifier is utilized to determine the performance of various CNN model combinations. A comparative analysis was also performed on classifiers, deep learning, the proposed model, and similar state-of-the-art studies. The experiments demonstrated that PDDNet-AE and PDDNet-LVE achieved 96.74% and 97.79%, respectively, compared to current CNNs when tested on several plant diseases, depicting its exceptional robustness and generalization capabilities and mitigating current concerns in plant disease detection and classification.
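
The abstract above combines the outputs of several CNNs in a voting ensemble. As a rough illustration only, the sketch below shows a plain majority vote over per-model class labels. It is an assumption that this resembles the idea behind the LVE model; the abstract does not specify the voting rule, and the function name `majority_vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model class labels by plain majority vote.

    predictions_per_model: list of lists, one inner list per model,
    each holding one predicted label per sample.
    Ties go to the tied label that appears first in model order
    (an arbitrary choice made for this sketch).
    """
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        # Collect the label each model assigned to sample i.
        votes = [preds[i] for preds in predictions_per_model]
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical labels from three models for three leaf images.
model_outputs = [
    ["blight", "healthy", "rust"],
    ["blight", "rust", "rust"],
    ["healthy", "healthy", "blight"],
]
print(majority_vote(model_outputs))
```

A weighted or probability-averaging vote would be an equally plausible reading; the sketch uses hard labels because they need no assumptions about model confidence scores.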


Subjects
Neural Networks, Computer , Plant Diseases , Fruit , Machine Learning
18.
J Comput Chem ; 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39223071

ABSTRACT

Predicting protein-ligand binding affinity is a crucial and challenging task in structure-based drug discovery. With the accumulation of complex structures and binding affinity data, various machine-learning scoring functions, particularly those based on deep learning, have been developed for this task, exhibiting superiority over their traditional counterparts. A fusion model sequentially connecting a graph neural network (GNN) and a convolutional neural network (CNN) to predict protein-ligand binding affinity is proposed in this work. In this model, the intermediate outputs of the GNN layers, as supplementary descriptors of atomic chemical environments at different levels, are concatenated with the input features of the CNN. The model demonstrates a noticeable improvement in performance on the CASF-2016 benchmark compared to its constituent CNN models. The generalization ability of the model is evaluated by setting a series of thresholds for ligand extended-connectivity fingerprint similarity or protein sequence similarity between the training and test sets. A masking experiment reveals that the model can capture key interaction regions. Furthermore, the fusion model is applied to a virtual screening task for a novel target, PI5P4Kα. The fusion strategy significantly improves the ability of the constituent CNN model to identify active compounds. This work offers a novel approach to enhancing the accuracy of deep learning models in predicting binding affinity through fusion strategies.

19.
Brief Bioinform ; 23(3)2022 May 13.
Article in English | MEDLINE | ID: mdl-35511057

ABSTRACT

Host-pathogen protein interactions (HPIs) play vital roles in many biological processes and are directly involved in infectious diseases. With pandemics becoming more frequent over the last couple of decades, such as the recent outbreak of COVID-19, which has caused millions of deaths, it has become more critical to develop advanced methods to accurately predict pathogen interactions with their respective hosts. During the last decade, experimental methods to identify HPIs have been used to decipher host-pathogen systems, with the caveat that those techniques are labor-intensive, expensive and time-consuming. Alternatively, accurate prediction of HPIs can be performed using data-driven machine learning. To provide a more robust and accurate solution for the HPI prediction problem, we have developed the deepHPI tool based on deep learning. The web server delivers four host-pathogen model types: plant-pathogen, human-bacteria, human-virus and animal-pathogen, which extends its applicability to a wide range of analyses and use cases. The deepHPI web tool is the first to use convolutional neural network models for HPI prediction. These models have been selected based on a comprehensive evaluation of protein features and neural network architectures. The best prediction models have been tested on independent validation datasets, achieving an overall Matthews correlation coefficient value of 0.87 for animal-pathogen using the combined pseudo-amino acid composition and conjoint triad (PAAC_CT) features, 0.75 for human-bacteria using the combined pseudo-amino acid composition, conjoint triad and normalized Moreau-Broto feature (PAAC_CT_NMBroto), 0.96 for human-virus using PAAC_CT_NMBroto and 0.94 for plant-pathogen interactions using the combined pseudo-amino acid composition, composition and transition feature (PAAC_CTDC_CTDT). Our server running deepHPI is deployed on a high-performance computing cluster that handles large and multiple user requests, and it provides more information about the interactions discovered. It presents an enriched visualization of the resulting host-pathogen networks that is augmented with external links to various protein annotation resources. We believe that the deepHPI web server will be very useful to researchers, particularly those working on infectious diseases. Additionally, many novel and known host-pathogen systems can be further investigated to significantly advance our understanding of complex disease-causing agents. The developed models are hosted on a web server, which is freely accessible at http://bioinfo.usu.edu/deepHPI/.
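
The abstract above reports model quality as Matthews correlation coefficient (MCC) values. For readers unfamiliar with the metric, the following minimal sketch computes the standard binary MCC from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the deepHPI evaluation; returning 0.0 when the denominator is zero is a common convention adopted here as an assumption.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention for degenerate cases (e.g. only one class predicted).
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for an interaction / non-interaction classifier.
print(round(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10), 3))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred when interacting and non-interacting pairs are imbalanced.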


Subjects
COVID-19 , Communicable Diseases , Deep Learning , Amino Acids , Animals , Host-Pathogen Interactions , Machine Learning
20.
Brief Bioinform ; 23(5)2022 Sep 20.
Article in English | MEDLINE | ID: mdl-35849101

ABSTRACT

The rapid development of spatial transcriptomics allows the measurement of RNA abundance at a high spatial resolution, making it possible to simultaneously profile gene expression, spatial locations of cells or spots, and the corresponding hematoxylin and eosin-stained histology images. It has become promising to predict gene expression from histology images, which are relatively easy and cheap to obtain. For this purpose, several methods have been devised, but they have not fully captured the internal relations of the 2D vision features or the spatial dependency between spots. Here, we developed Hist2ST, a deep learning-based model to predict RNA-seq expression from histology images. Around each sequenced spot, the corresponding histology image is cropped into an image patch and fed into a convolutional module to extract 2D vision features. Meanwhile, the spatial relations with the whole image and neighboring patches are captured through Transformer and graph neural network modules, respectively. These learned features are then used to predict gene expression following a zero-inflated negative binomial distribution. To alleviate the impact of the small size of spatial transcriptomics datasets, a self-distillation mechanism is employed for efficient learning of the model. In comprehensive tests on cancer and normal datasets, Hist2ST was shown to outperform existing methods in terms of both gene expression prediction and spatial region identification. Further pathway analyses indicated that our model could preserve biological information. Thus, Hist2ST enables generating spatial transcriptomics data from histology images for elucidating molecular signatures of tissues.
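
The abstract above models gene expression with a zero-inflated negative binomial (ZINB) distribution. As a hedged illustration, the sketch below evaluates the ZINB log-probability of a single count under one common parametrization (mean `mu`, dispersion `theta`, zero-inflation probability `pi`). The parametrization, function names and example values are assumptions made for this sketch; the abstract does not state how Hist2ST parametrizes or fits the distribution.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    With probability pi (0 <= pi < 1) the count is a structural zero;
    otherwise it is drawn from the negative binomial.
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero can come from either the inflation component or the NB.
        return math.log(pi + (1 - pi) * math.exp(nb))
    return math.log(1 - pi) + nb


# Hypothetical parameters for one gene at one spot.
for count in (0, 1, 5):
    print(count, zinb_log_pmf(count, mu=2.0, theta=1.5, pi=0.3))
```

In a training setting, the negative sum of such log-probabilities over spots and genes would typically serve as the loss, with `mu`, `theta` and `pi` produced by the network.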


Subjects
Image Processing, Computer-Assisted , Transcriptome , Eosine Yellowish-(YS) , Hematoxylin , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , RNA