Results 1 - 20 of 41
1.
Bioinformatics ; 40(Supplement_1): i539-i547, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940179

ABSTRACT

MOTIVATION: In drug discovery, it is crucial to assess the drug-target binding affinity (DTA). Although molecular docking is widely used, computational efficiency limits its application in large-scale virtual screening. Deep learning-based methods learn virtual scoring functions from labeled datasets and can quickly predict affinity. However, there are three limitations. First, existing methods only consider the atom-bond graph or one-dimensional sequence representations of compounds, ignoring the information about functional groups (pharmacophores) with specific biological activities. Second, relying on limited labeled datasets fails to learn comprehensive embedding representations of compounds and proteins, resulting in poor generalization performance in complex scenarios. Third, existing feature fusion methods cannot adequately capture contextual interaction information. RESULTS: Therefore, we propose a novel DTA prediction method named HeteroDTA. Specifically, a multi-view compound feature extraction module is constructed to model the atom-bond graph and pharmacophore graph. The residue contact graph and protein sequence are also utilized to model protein structure and function. Moreover, to enhance the generalization capability and reduce the dependence on task-specific labeled data, pre-trained models are utilized to initialize the atomic features of the compounds and the embedding representations of the protein sequence. A context-aware nonlinear feature fusion method is also proposed to learn interaction patterns between compounds and proteins. Experimental results on public benchmark datasets show that HeteroDTA significantly outperforms existing methods. In addition, HeteroDTA shows excellent generalization performance in cold-start experiments and superiority in the representation learning ability of drug-target pairs. Finally, the effectiveness of HeteroDTA is demonstrated in a real-world drug discovery study. AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/daydayupzzl/HeteroDTA.
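The abstract describes the fusion only at a high level. As a loose illustration of what a context-aware nonlinear fusion of compound and protein embeddings could look like, here is a minimal PyTorch sketch; the module name, dimensions, and gating formulation are assumptions for illustration, not the HeteroDTA implementation (the authors' code is in the repository linked above).

```python
import torch
import torch.nn as nn

class ContextAwareFusion(nn.Module):
    """Hypothetical sketch of context-aware nonlinear fusion (not HeteroDTA's code):
    each modality is gated by a context vector built from both inputs, then the
    gated features are combined through an MLP to predict binding affinity."""
    def __init__(self, comp_dim=128, prot_dim=128, hidden=256):
        super().__init__()
        self.context = nn.Linear(comp_dim + prot_dim, hidden)
        self.gate_c = nn.Linear(hidden, comp_dim)
        self.gate_p = nn.Linear(hidden, prot_dim)
        self.out = nn.Sequential(
            nn.Linear(comp_dim + prot_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted binding affinity
        )

    def forward(self, comp_emb, prot_emb):
        ctx = torch.relu(self.context(torch.cat([comp_emb, prot_emb], dim=-1)))
        comp_g = comp_emb * torch.sigmoid(self.gate_c(ctx))  # context-gated compound view
        prot_g = prot_emb * torch.sigmoid(self.gate_p(ctx))  # context-gated protein view
        return self.out(torch.cat([comp_g, prot_g], dim=-1))

# usage: affinity = ContextAwareFusion()(torch.randn(8, 128), torch.randn(8, 128))
```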


Subject(s)
Drug Discovery , Drug Discovery/methods , Molecular Docking Simulation , Proteins/chemistry , Proteins/metabolism , Deep Learning , Pharmacophore
3.
Front Neurosci ; 18: 1349781, 2024.
Article in English | MEDLINE | ID: mdl-38560048

ABSTRACT

Background and objectives: Glioblastoma (GBM) and brain metastasis (MET) are the two most common intracranial tumors. However, their different pathogeneses lead to completely different treatment options. On magnetic resonance imaging (MRI), GBM and MET appear extremely similar, which makes differentiation by imaging extremely challenging. Therefore, this study explores an improved deep learning algorithm to assist in the differentiation of GBM and MET. Materials and methods: For this study, axial contrast-enhanced T1-weighted (ceT1W) MRI images from 321 cases of high-grade glioma and solitary brain metastasis were collected. Among these, 251 out of 270 cases were selected for the experimental dataset (127 glioblastomas and 124 metastases); 207 cases were chosen as the training dataset and 44 cases as the testing dataset. We designed a new deep learning algorithm called SCAT-inception (Spatial Convolutional Attention inception) and used five-fold cross-validation to verify the results. Results: By employing the newly designed SCAT-inception model to predict glioblastomas and brain metastases, the prediction accuracy reached 92.3%, and the sensitivity and specificity reached 93.5% and 91.1%, respectively. On the external testing dataset, our model achieved an accuracy of 91.5%, surpassing other models such as VGG, UNet, and GoogLeNet. Conclusion: This study demonstrated that the SCAT-inception architecture could extract more subtle features from ceT1W images, provide state-of-the-art performance in the differentiation of GBM and MET, and surpass most existing approaches.
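The abstract names the SCAT-inception architecture without giving details. For orientation only, the following PyTorch sketch shows a generic spatial-attention block of the kind the name suggests; it is an assumed, CBAM-style component, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SpatialConvAttention(nn.Module):
    """Generic spatial-attention block over convolutional feature maps, shown only
    to illustrate the kind of component 'spatial convolutional attention' suggests;
    the paper's actual SCAT-inception design is not reproduced here."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                      # x: (batch, channels, H, W)
        avg = x.mean(dim=1, keepdim=True)      # channel-averaged map
        mx, _ = x.max(dim=1, keepdim=True)     # channel-max map
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                        # re-weight spatial locations
```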

4.
J Imaging ; 10(3)2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38535137

ABSTRACT

Language bias stands as a noteworthy concern in visual question answering (VQA), wherein models tend to rely on spurious correlations between questions and answers for prediction. This prevents the models from effectively generalizing, leading to a decrease in performance. In order to address this bias, we propose a novel modality fusion collaborative de-biasing algorithm (CoD). In our approach, bias is considered as the model's neglect of information from a particular modality during prediction. We employ a collaborative training approach to facilitate mutual modeling between different modalities, achieving efficient feature fusion and enabling the model to fully leverage multimodal knowledge for prediction. Our experiments on various datasets, including VQA-CP v2, VQA v2, and VQA-VS, using different validation strategies, demonstrate the effectiveness of our approach. Notably, employing a basic baseline model resulted in an accuracy of 60.14% on VQA-CP v2.

5.
Plant Methods ; 20(1): 25, 2024 Feb 04.
Article in English | MEDLINE | ID: mdl-38311765

ABSTRACT

BACKGROUND: Mastering the spatial distribution and planting area of paddy rice can provide a scientific basis for monitoring rice production and planning grain production layout. Previous remote sensing studies of paddy rice concentrated on plain areas with large fields and ignored the fact that rice is also widely planted in vast hilly regions. In addition, the land cover types in these regions are diverse, and rice fields are small or medium-sized with a scattered, fragmented distribution, which poses difficulties for high-precision rice recognition. METHODS: In this paper, we propose a solution based on Sentinel-1 SAR, Sentinel-2 MSI, DEM, and rice calendar data that focuses on rice field identification in hilly areas. This solution mainly includes the construction of a rice feature dataset at four crucial phenological periods, the generation of a rice standard spectral curve, and a spectral similarity algorithm for rice identification. RESULTS: The solution, integrating topographical and rice phenological characteristics, proved effective, with an overall accuracy exceeding 0.85. Comparison with UAV imagery showed that rice fields with an area exceeding 400 m2 (equivalent to 4 pixels) were recognized with a success rate of over 79%, which reached 89% for fields exceeding 800 m2. CONCLUSIONS: The study illustrated that the proposed solution, integrating topographical and rice phenological characteristics, is capable of charting rice fields of various sizes with a fragmented and dispersed distribution. It also revealed that the synergy of Sentinel-1 SAR and Sentinel-2 MSI data significantly enhanced the recognition of rice paddy fields ranging from 400 m2 to 2000 m2.
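The spectral similarity algorithm is not specified in the abstract. A common choice for comparing a pixel's multi-period feature curve against a standard curve is the spectral angle; the sketch below assumes that formulation, and the array shapes and threshold are hypothetical.

```python
import numpy as np

def spectral_angle(pixel_curve, reference_curve):
    """Spectral angle (radians) between a pixel's temporal-spectral curve and the
    rice standard curve; smaller angles mean higher similarity."""
    cos = np.dot(pixel_curve, reference_curve) / (
        np.linalg.norm(pixel_curve) * np.linalg.norm(reference_curve) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def classify_rice(image_stack, rice_reference, threshold=0.15):
    """image_stack: (bands*periods, rows, cols) features from the four phenological
    periods; returns a boolean rice mask (illustrative thresholding rule)."""
    h, w = image_stack.shape[1:]
    flat = image_stack.reshape(image_stack.shape[0], -1).T   # (pixels, features)
    angles = np.array([spectral_angle(p, rice_reference) for p in flat])
    return (angles < threshold).reshape(h, w)
```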

6.
Nature ; 624(7991): 355-365, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38092919

ABSTRACT

Single-cell analyses parse the brain's billions of neurons into thousands of 'cell-type' clusters residing in different brain structures [1]. Many cell types mediate their functions through targeted long-distance projections allowing interactions between specific cell types. Here we used epi-retro-seq [2] to link single-cell epigenomes and cell types to long-distance projections for 33,034 neurons dissected from 32 different regions projecting to 24 different targets (225 source-to-target combinations) across the whole mouse brain. We highlight uses of these data for interrogating principles relating projection types to transcriptomics and epigenomics, and for addressing hypotheses about cell types and connections related to genetics. We provide an overall synthesis with 926 statistical comparisons of discriminability of neurons projecting to each target for every source. We integrate this dataset into the larger BRAIN Initiative Cell Census Network atlas, composed of millions of neurons, to link projection cell types to consensus clusters. Integration with spatial transcriptomics further assigns projection-enriched clusters to smaller source regions than the original dissections. We exemplify this by presenting in-depth analyses of projection neurons from the hypothalamus, thalamus, hindbrain, amygdala and midbrain to provide insights into properties of those cell types, including differentially expressed genes, their associated cis-regulatory elements and transcription-factor-binding motifs, and neurotransmitter use.


Subject(s)
Brain , Epigenomics , Neural Pathways , Neurons , Animals , Mice , Amygdala , Brain/cytology , Brain/metabolism , Consensus Sequence , Datasets as Topic , Gene Expression Profiling , Hypothalamus/cytology , Mesencephalon/cytology , Neural Pathways/cytology , Neurons/metabolism , Neurotransmitter Agents/metabolism , Regulatory Sequences, Nucleic Acid , Rhombencephalon/cytology , Single-Cell Analysis , Thalamus/cytology , Transcription Factors/metabolism
7.
Front Aging Neurosci ; 15: 1267020, 2023.
Article in English | MEDLINE | ID: mdl-38020780

ABSTRACT

Alzheimer's disease (AD) is the most common cause of dementia. Accurate prediction and diagnosis of AD and its prodromal stage, i.e., mild cognitive impairment (MCI), are essential for possible delay and early treatment of the disease. In this paper, we adopt data from the China Longitudinal Aging Study (CLAS), which was launched in 2011 as a joint effort of 15 institutions across the country. A total of 4,411 people aged at least 60 years participated in the project, of whom 3,514 completed the baseline survey. The survey collected data including demographic information, daily lifestyle, medical history, and routine physical examination. In particular, we employ ensemble learning and feature selection methods to develop an explainable prediction model for AD and MCI. Five feature selection methods and nine machine learning classifiers are compared to find the features most dominant in AD/MCI prediction. The resulting model achieves an accuracy of 89.2%, sensitivity of 87.7%, and specificity of 90.7% for MCI prediction, and an accuracy of 99.2%, sensitivity of 99.7%, and specificity of 98.7% for AD prediction. We further utilize the SHapley Additive exPlanations (SHAP) algorithm to visualize the specific contribution of each feature to AD/MCI prediction at both global and individual levels. Consequently, our model not only provides the prediction outcome but also helps to clarify the relationship between lifestyle/physical disease history and cognitive function, enabling clinicians to make appropriate recommendations for the elderly. Therefore, our approach provides a new perspective for the design of a computer-aided diagnosis system for AD and MCI and has high potential clinical application value.
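The exact feature selection methods and classifiers compared are not listed in the abstract. The following sketch shows one standard way to combine feature selection, an ensemble classifier, and SHAP explanations in scikit-learn; the chosen estimators and the number of selected features are placeholders, not the study's configuration.

```python
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split

def train_and_explain(X, y, k=30):
    """X: survey features (demographics, lifestyle, medical history, physical exam);
    y: MCI or AD labels. Both are placeholders for the CLAS data."""
    selector = SelectKBest(mutual_info_classif, k=k).fit(X, y)
    X_sel = selector.transform(X)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_sel, y, test_size=0.2, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    explainer = shap.TreeExplainer(clf)           # global and per-sample attributions
    shap_values = explainer.shap_values(X_te)     # per-feature contributions
    return clf, selector, shap_values
```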

8.
Sensors (Basel) ; 23(20)2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37896624

ABSTRACT

Selecting training samples is crucial in remote sensing image classification. In this paper, we selected three images (Sentinel-2, GF-1, and Landsat 8) and employed three methods for selecting training samples: grouping selection, entropy-based selection, and direct selection. We then used the selected training samples to train three supervised classification models: random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN). Finally, we evaluated the classification results for the three images. According to the experimental results, the three classification models performed similarly. Compared with the entropy-based method, the grouping selection method achieved higher classification accuracy using fewer samples. In addition, the grouping selection method outperformed the direct selection method with the same number of samples. Therefore, the grouping selection method performed best. When using the grouping selection method, the classification accuracy increased with the number of samples within a certain sample-size range.
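The abstract does not define the grouping selection rule precisely. One plausible reading, sketched below under that assumption, is stratified sampling within spectral groups of each class so that the training set covers within-class variability; the clustering method and sample counts are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

def grouping_selection(features, labels, per_group=20, groups=5, seed=0):
    """Illustrative 'grouping selection': within each class, cluster candidate pixels
    into spectral groups and draw samples from every group, so the training set spans
    the class's internal variability. Not the paper's exact procedure."""
    rng = np.random.default_rng(seed)
    selected = []
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        km = KMeans(n_clusters=min(groups, len(idx)), n_init=10,
                    random_state=seed).fit(features[idx])
        for g in np.unique(km.labels_):
            members = idx[km.labels_ == g]
            take = rng.choice(members, size=min(per_group, len(members)),
                              replace=False)
            selected.extend(take.tolist())
    return np.array(selected)
```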

9.
Lab Chip ; 23(19): 4324-4333, 2023 09 26.
Article in English | MEDLINE | ID: mdl-37702391

ABSTRACT

Particle separation plays a critical role in many biochemical analyses. In this article, we report a method of reverse flow enhanced inertia pinched flow fractionation (RF-iPFF) for particle separation. RF-iPFF separates particles by size based on the flow-induced inertial lift, and in the abruptly broadened segment, reverse flow is utilized to further enhance the separation distance between particles of different sizes. The separation performance can be significantly improved by the reverse flow. Generally, compared with the case without reverse flow, the RF-iPFF technique can increase particle throughput by about 10 times. To demonstrate its advantages, RF-iPFF was compared with traditional iPFF in a control experiment, and RF-iPFF consistently outperformed iPFF across the various conditions studied. In addition, we used tumor cells spiked into human whole blood to evaluate the separation performance of RF-iPFF.


Subject(s)
Microfluidic Analytical Techniques , Humans , Particle Size , Microfluidic Analytical Techniques/methods , Chemical Fractionation/methods , Microspheres
10.
Article in English | MEDLINE | ID: mdl-37773916

ABSTRACT

In recent years, Graph Neural Networks (GNNs) based on deep learning techniques have achieved promising results in EEG-based depression detection tasks but still have some limitations. First, most existing GNN-based methods use pre-computed graph adjacency matrices, which ignore the differences in brain networks between individuals. Additionally, methods based on graph-structured data do not consider the temporal dependency information of brain networks. To address these issues, we propose a deep learning algorithm that explores adaptive graph topologies and temporal graph networks for EEG-based depression detection. Specifically, we designed an Adaptive Graph Topology Generation (AGTG) module that can adaptively model the real-time connectivity of brain networks, revealing differences between individuals. In addition, we designed a Graph Convolutional Gated Recurrent Unit (GCGRU) module to capture the temporal dynamics of brain networks. To further explore the differential features between depressed and healthy individuals, we adopt a Graph Topology-based Max-Pooling (GTMP) module to extract graph representation vectors accurately. We conduct a comparative analysis with several advanced algorithms on both public and our own datasets. The results reveal that our final model achieves the highest Area Under the Receiver Operating Characteristic Curve (AUROC) on both datasets, with values of 83% and 99%, respectively. Furthermore, we perform extensive validation experiments demonstrating our proposed method's effectiveness and advantages. Finally, we present a comprehensive discussion of the differences in brain networks between healthy and depressed individuals based on the outputs of our final model's AGTG and GTMP modules.
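The AGTG module is described only functionally. The sketch below shows a generic way to generate an adaptive, data-dependent adjacency matrix for EEG channels from learnable node embeddings; it is a common pattern offered for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class AdaptiveGraphTopology(nn.Module):
    """Illustrative adaptive adjacency generation for EEG channels (not the AGTG
    module itself): a static similarity from learnable channel embeddings is mixed
    with a data-dependent similarity, then normalized row-wise with softmax."""
    def __init__(self, num_channels, emb_dim=16):
        super().__init__()
        self.emb = nn.Parameter(torch.randn(num_channels, emb_dim))

    def forward(self, x):
        # x: (batch, channels, features) EEG features for one window
        sim = self.emb @ self.emb.t()           # (channels, channels) static part
        dyn = torch.bmm(x, x.transpose(1, 2))   # (batch, channels, channels) dynamic part
        adj = torch.softmax(torch.relu(sim + dyn), dim=-1)
        return adj                              # per-sample adjacency matrices
```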

11.
Front Neurosci ; 17: 1203698, 2023.
Article in English | MEDLINE | ID: mdl-37575298

ABSTRACT

Objective: This study aimed to investigate the reliability of a deep neural network (DNN) model trained only on contrast-enhanced T1 (T1CE) images for predicting intraoperative cerebrospinal fluid (ioCSF) leaks in endoscopic transsphenoidal surgery (EETS). Methods: A total of 396 pituitary adenoma (PA) cases were reviewed; only primary PAs with Hardy suprasellar Stages A, B, and C were included in this study. The T1CE images of these patients were collected, and sagittal and coronal T1CE slices were selected for training the DNN model. The model performance was evaluated and tested, and its interpretability was explored. Results: A total of 102 PA cases were enrolled in this study, 51 from the ioCSF leakage group and 51 from the non-ioCSF leakage group. In total, 306 sagittal and 306 coronal T1CE slices were collected as the original dataset, and data augmentation was applied before model training and testing. On the test dataset, the DNN model provided a single-slice prediction accuracy of 97.29%, a sensitivity of 98.25%, and a specificity of 96.35%. In the clinical test, the accuracy of the DNN model in predicting ioCSF leaks in patients reached 84.6%. The feature maps of the model were visualized, and the regions of interest for prediction were the tumor roof and suprasellar region. Conclusion: In this study, the DNN model could predict ioCSF leaks from preoperative T1CE images, especially in PAs in Hardy Stages A, B, and C. The regions of interest used by the model during prediction are similar to those used by humans. DNN models trained with preoperative MRI images may provide a novel tool for predicting ioCSF leak risk in PA patients.

12.
Entropy (Basel) ; 25(3)2023 Mar 17.
Article in English | MEDLINE | ID: mdl-36981408

ABSTRACT

Recurrent Neural Networks (RNNs) are applied in safety-critical fields such as autonomous driving, aircraft collision detection, and smart credit. They are highly susceptible to input perturbations, yet little research on RNN-oriented testing techniques has been conducted, leaving a threat to a large number of sequential application domains. To address these gaps, improve the test adequacy of RNNs, find more defects, and improve the performance of RNN models and their robustness to input perturbations, we propose a test coverage metric for the underlying structure of RNNs, which is used to guide the generation of test inputs for testing RNNs. Although coverage metrics have been proposed for RNNs, such as the hidden state coverage in RNN-Test, they ignore the fact that the underlying structure of RNNs is still a fully connected neural network, but with an additional "delayer" that records the network state at the time of data input. We use contributions, i.e., the combination of the outputs of neurons and the weights they emit, as the minimum computational unit of RNNs to explore the finer-grained logical structure inside the recurrent cells. Compared to existing coverage metrics, our approach covers the decision mechanism of RNNs in more detail and is more likely to generate adversarial samples and discover flaws in the model. In this paper, we redefine the contribution coverage metric to apply to Stacked LSTMs and Stacked GRUs by considering the joint effect of neurons and weights in the underlying structure of the neural network. We propose a new coverage metric, RNNCon, which can be used to guide the generation of adversarial test inputs, and we design and implement a test framework prototype, RNNCon-Test. Two datasets, four LSTM models, and four GRU models are used to verify the effectiveness of RNNCon-Test. Compared to the current state-of-the-art study RNN-Test, RNNCon can cover deeper decision logic of RNNs. RNNCon-Test is not only effective in identifying defects in Deep Learning (DL) systems but can also improve the performance of a model when the adversarial inputs it generates are filtered and added to the training set for retraining. Even when the accuracy of the model is already high, RNNCon-Test is still able to improve it by up to 0.45%.
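Based on the description of contributions as products of neuron outputs and the weights they emit, a toy version of such a coverage computation might look like the following; the thresholding rule and tensor shapes are assumptions, not the RNNCon definition.

```python
import numpy as np

def contribution_coverage(activations, weights, threshold=0.0):
    """Toy illustration of a contribution-style coverage metric: a 'contribution'
    is the product of a neuron's output and the weight it emits to the next layer;
    coverage is the fraction of contributions exceeding the threshold on at least
    one test input.
    activations: (num_inputs, n_in) outputs of one layer over the test set
    weights:     (n_in, n_out) weight matrix feeding the next layer"""
    covered = np.zeros(weights.shape, dtype=bool)
    for a in activations:
        contrib = a[:, None] * weights     # (n_in, n_out) contributions for one input
        covered |= contrib > threshold
    return covered.mean()                  # fraction of covered contributions

# usage: contribution_coverage(np.random.rand(100, 64), np.random.randn(64, 32))
```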

13.
IEEE J Biomed Health Inform ; 27(2): 652-663, 2023 02.
Article in English | MEDLINE | ID: mdl-35771792

ABSTRACT

Nowadays, Federated Learning (FL) over Internet of Medical Things (IoMT) devices has become a research hotspot. As a new architecture, FL can well protect the data privacy of IoMT devices, but the security of neural network model transmission cannot be guaranteed. On the other hand, current popular neural network models are usually relatively large, and deploying them on IoMT devices is a challenge. One promising approach to these problems is to reduce the network scale by quantizing the parameters of the neural networks, which can greatly improve the security of data transmission and reduce the transmission cost. In the previous literature, the fixed-point quantizer with stochastic rounding has been shown to perform better than other quantization methods. However, how to design such a quantizer to achieve the minimum square quantization error is still unknown. In addition, how to apply this quantizer in the FL framework also needs investigation. To address these questions, in this paper we propose FedMSQE (Federated Learning with Minimum Square Quantization Error), which achieves the smallest quantization error for each individual client in the FL setting. Through numerical experiments in both single-node and FL scenarios, we show that our proposed algorithm achieves higher accuracy and lower quantization error than other quantization methods.
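Fixed-point quantization with stochastic rounding is a standard building block, so a generic version can be sketched; the bit-width split and clipping convention below are illustrative choices, not the FedMSQE quantizer derived in the paper.

```python
import numpy as np

def fixed_point_quantize(x, num_bits=8, frac_bits=4, rng=None):
    """Generic fixed-point quantizer with stochastic rounding (a sketch of the kind
    of quantizer discussed in the abstract, not the FedMSQE design). Values are
    scaled by 2**frac_bits, rounded up with probability equal to the fractional
    part (so rounding is unbiased), then clipped to the representable range."""
    rng = rng or np.random.default_rng()
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=float) * scale
    floor = np.floor(scaled)
    prob_up = scaled - floor                           # fractional part = P(round up)
    rounded = floor + (rng.random(scaled.shape) < prob_up)
    max_level = 2 ** (num_bits - 1) - 1
    return np.clip(rounded, -max_level - 1, max_level) / scale

# usage: q = fixed_point_quantize(np.random.randn(1000) * 0.5)
```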


Subject(s)
Internet of Things , Humans , Internet , Algorithms , Neural Networks, Computer , Privacy
17.
Comput Intell Neurosci ; 2022: 8342638, 2022.
Article in English | MEDLINE | ID: mdl-36407688

ABSTRACT

Federated learning (FL), a distributed machine-learning framework, is poised to effectively protect data privacy and security, and it has been widely applied in a variety of fields in recent years. However, the system heterogeneity and statistical heterogeneity of FL pose serious obstacles to the quality of the global model. This study investigates server and client resource allocation in the context of FL system resource efficiency and offers the FedAwo optimization algorithm. The approach combines adaptive learning with federated learning and makes full use of the computing resources of the server to calculate the optimal weight value for each client. It aggregates the global model according to these optimal weights, which significantly reduces the detrimental effects of statistical and system heterogeneity. In traditional FL, we found that many clients' local training converges earlier than the specified number of epochs. However, under traditional FL, each client must still train for the full specified number of epochs, which makes a large amount of client-side computation meaningless. To further lower the training cost, the augmented FedAwo* algorithm is proposed. FedAwo* takes into account the heterogeneity of clients and sets a criterion for local convergence; when a client's local model reaches the criterion, it is returned to the server immediately. In this way, the number of local epochs can be adapted dynamically for each client. Extensive experiments on the MNIST and Fashion-MNIST public datasets reveal that the global model converges faster and achieves higher accuracy with the FedAwo and FedAwo* algorithms than with the FedAvg, FedProx, and FedAdp baseline algorithms.
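The early-return idea of FedAwo* can be illustrated with a simple client-side loop; the convergence tolerance and the structure of the training step below are assumptions for illustration, not the authors' criterion.

```python
import copy

def local_update(model, train_step, max_epochs, tol=1e-4):
    """Illustrative FedAwo*-style client loop (an assumption, not the authors' code):
    the client trains for at most max_epochs but returns its model to the server as
    soon as the loss improvement falls below a convergence tolerance."""
    prev_loss = float("inf")
    for epoch in range(max_epochs):
        loss = train_step(model)            # one local epoch; returns mean training loss
        if abs(prev_loss - loss) < tol:     # local convergence criterion reached
            return copy.deepcopy(model), epoch + 1
        prev_loss = loss
    return copy.deepcopy(model), max_epochs
```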


Subject(s)
Algorithms , Machine Learning , Humans , Computers
18.
Sensors (Basel) ; 22(21)2022 Nov 03.
Article in English | MEDLINE | ID: mdl-36366172

ABSTRACT

With the development of the Internet of Things (IoT), federated learning (FL) has received increasing attention as a distributed machine learning (ML) framework that does not require data exchange. However, current FL frameworks follow an idealized setup in which the task size is fixed and the storage space is unlimited, which is impossible in the real world. In fact, new classes always emerge over time on the participating clients, and some samples are overwritten or discarded due to storage limitations. We urgently need a new framework that adapts to the dynamic task sequences and strict storage constraints of the real world. Continuous or incremental learning is the ultimate goal of deep learning, and we introduce incremental learning into FL to describe a new federated learning framework. New-generation federated learning (NGFL) is probably the most desirable framework for FL, in which, in addition to the basic task of training with the server, each client needs to learn its private tasks, which arrive continuously and independently of communication with the server. We give a rigorous mathematical representation of this framework, detail several major challenges it raises, address the main challenges of combining incremental learning with federated learning (aggregation of heterogeneous output layers and the task-transformation mutual-knowledge problem), and show lower and upper baselines for the framework.


Subject(s)
Algorithms , Machine Learning , Humans , Computers
19.
Sci Data ; 9(1): 613, 2022 10 11.
Article in English | MEDLINE | ID: mdl-36220857

ABSTRACT

Nitrate pollution in groundwater, an international problem, threatens human health and the environment. Nitrate can take decades to travel through the groundwater system. For understanding the impact of this nitrate legacy on water quality, the nitrate transport velocity (vN) in the unsaturated zone (USZ) is of great significance. Although some locally measured or simulated USZ vN data are available, no such dataset has existed at the global scale. Here, we present a Global-scale unsaturated zone Nitrate transport Velocity dataset (GNV) generated from a Nitrate Time Bomb (NTB) model using global permeability and porosity data and global average annual groundwater recharge data. To evaluate GNV, a baseline dataset of USZ vN was created using locally measured data and global lithological data. The results show that 94.50% of GNV matches the baseline USZ vN dataset. This dataset will contribute substantially to research on the nitrate legacy in the groundwater system, provide evidence for managing nitrate water pollution, and promote international and interdisciplinary collaboration.
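For context, a simple piston-flow approximation is often used to estimate unsaturated-zone transport velocity from recharge and porosity; the sketch below shows only that textbook relation, and the NTB model behind GNV is more involved.

```python
def nitrate_velocity(recharge_mm_per_yr, porosity):
    """Piston-flow approximation of unsaturated-zone nitrate velocity (m/yr):
    velocity = recharge / effective porosity. A generic textbook relation used
    here for illustration only; not the NTB model used to generate GNV."""
    return (recharge_mm_per_yr / 1000.0) / porosity

# e.g. 200 mm/yr recharge through material with 0.3 effective porosity ≈ 0.67 m/yr
```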

20.
Sensors (Basel) ; 22(15)2022 Aug 05.
Article in English | MEDLINE | ID: mdl-35957410

ABSTRACT

Machine learning combined with satellite image time series can be used to map crop distribution and monitor crop growth quickly and reliably, which is necessary for food security. However, obtaining a large number of field survey samples for classifier training is often time-consuming and costly, which results in very slow production of crop distribution maps. To overcome this challenge, we propose an ensemble learning approach that uses the existing historical crop data layer (CDL) to automatically create multitudes of samples according to rules of spatiotemporal sample selection. Sentinel-2 monthly composite images from 2017 to 2019 were mosaicked and classified for crop distribution mapping in Jilin Province. Classification accuracies of four machine learning algorithms were compared for single-month and multi-month time series. The results show that the deep neural network (DNN) performed best, followed by random forest (RF), then decision tree (DT), with support vector machine (SVM) performing the worst. Compared with other months, July and August had higher classification accuracy, with kappa coefficients of 0.78 and 0.79, respectively. Compared with a single date, the kappa coefficient gradually increased with the length of the time series, reaching 0.94 as early as August; beyond that the increase was small, with a maximum of 0.95 over the whole growth cycle. During the mapping process, time series of different lengths produced different classification results, and wetland types were misclassified as rice; in such cases, we combined time series of two lengths to correct the misclassified rice. In comparisons with existing products and field points, rice showed the highest consistency, followed by corn, whereas soybeans showed the least. This shows that the sample dataset and trained model generated in this research can meet crop mapping accuracy requirements while reducing the cost of field surveys. For further research, more years and more crop types should be considered for mapping and validation.
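The spatiotemporal sample selection rules are not spelled out in the abstract. One plausible sketch, under the assumption that temporally stable, field-interior CDL pixels make good automatic samples, is shown below; the stability rule, erosion depth, and sample count are illustrative only.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def auto_samples_from_cdl(cdl_stack, crop_code, n_samples=5000, seed=0):
    """Illustrative automatic sample selection from historical CDL layers (an
    assumption about the rules, not the paper's exact procedure): keep pixels
    labelled as the crop in every year, erode the mask so samples sit inside
    fields rather than on boundaries, then draw a random subset."""
    stable = np.all(cdl_stack == crop_code, axis=0)    # consistent label across years
    interior = binary_erosion(stable, iterations=2)    # drop field-edge pixels
    rows, cols = np.nonzero(interior)
    rng = np.random.default_rng(seed)
    pick = rng.choice(len(rows), size=min(n_samples, len(rows)), replace=False)
    return rows[pick], cols[pick]
```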


Subject(s)
Neural Networks, Computer , Support Vector Machine , Algorithms , Crops, Agricultural , Machine Learning