1.
Neural Netw ; 181: 106763, 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39378603

ABSTRACT

Unlike traditional supervised classification, complementary label learning (CLL) operates under a weak supervision framework, where each sample is annotated by excluding several incorrect labels, known as complementary labels (CLs). Despite reducing the labeling burden, CLL often suffers a decline in performance due to the weakened supervised information. To overcome this limitation, in this study a multi-view fusion and self-adaptive label discovery based CLL method (MVSLDCLL) is proposed. The self-adaptive label discovery strategy leverages graph-based semi-supervised learning to capture the label distribution of each training sample as a convex combination of all its potential labels. The multi-view fusion module is designed to adapt to various views of feature representations. Specifically, it minimizes the discrepancies of label projections between pairwise views, aligning with the consensus principle. Additionally, a straightforward mechanism inspired by a teamwork analogy is proposed to incorporate view-discrepancy for each sample. Experimental results demonstrate that MVSLDCLL learns more discriminative label distributions and achieves significantly higher accuracies than state-of-the-art CLL methods. An ablation study was also performed to validate the effectiveness of both the self-adaptive label discovery strategy and the multi-view fusion module.
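The label discovery idea described above, representing each sample's label distribution as a convex combination over its potential (non-complementary) labels and refining it by graph-based propagation, can be pictured with the minimal sketch below. This is an illustration under assumptions (a Gaussian similarity graph and a standard label-propagation update), not the authors' implementation; the function name and parameters are hypothetical.

```python
import numpy as np

def discover_label_distributions(X, comp_labels, n_classes, n_iters=20, alpha=0.9):
    """Hypothetical sketch of self-adaptive label discovery under CLL.

    comp_labels[i] is the set of complementary (excluded) labels of sample i.
    Returns a row-stochastic matrix whose i-th row is a convex combination
    over sample i's potential (non-excluded) labels.
    """
    n = X.shape[0]
    # Gaussian-kernel similarity graph, row-normalised for propagation.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * np.median(d2) + 1e-12))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)

    # Mask of potential labels: 1 everywhere except the complementary labels.
    mask = np.ones((n, n_classes))
    for i, excluded in enumerate(comp_labels):
        mask[i, list(excluded)] = 0.0

    F = mask / mask.sum(axis=1, keepdims=True)  # uniform start over potential labels
    F0 = F.copy()
    for _ in range(n_iters):
        F = alpha * P @ F + (1 - alpha) * F0    # graph-based propagation step
        F *= mask                               # complementary labels stay at zero
        F /= F.sum(axis=1, keepdims=True)       # keep each row a convex combination
    return F
```

Each output row sums to one and assigns zero mass to the excluded labels, matching the "convex combination of all potential labels" description.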

2.
Neural Netw ; 180: 106748, 2024 Sep 21.
Article in English | MEDLINE | ID: mdl-39332211

ABSTRACT

Amid advancements in feature extraction techniques, research on multi-view multi-label classification has attracted widespread interest in recent years. However, real-world scenarios often pose a challenge where the completeness of multiple views and labels cannot be ensured. At present, only a handful of techniques have attempted to address the complex issue of partial multi-view incomplete multi-label classification, and the majority of these approaches overlook the significance of manifold structures between instances. To tackle these challenges, we propose a novel partial multi-view incomplete multi-label learning model, termed MSLPP. Differing from existing studies, MSLPP emphasizes retaining the effective inherent structure of the data during feature extraction, thereby facilitating richer semantic information extraction. Specifically, MSLPP captures and integrates four types of information: the distance and similarity information in the original feature space, and the distance and similarity information in the extracted feature space. Further, by adopting a graph embedding technique, it simultaneously preserves the intrinsic structure with multi-scale information through a constraint term. Moreover, taking into account the negative impact of missing views on both the model and the inherent data structure, we further propose a shielding strategy for missing views, which not only eliminates their negative effects on the model but also more accurately captures the inherent data structure. Experimental results on five widely recognized datasets indicate that MSLPP outperforms many state-of-the-art methods.

3.
Comput Methods Programs Biomed ; 257: 108400, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39270533

ABSTRACT

BACKGROUND AND OBJECTIVE: Accurate prognosis prediction for cancer patients plays a significant role in the formulation of treatment strategies, considerably impacting personalized medicine. Recent advancements in this field indicate that integrating information from various modalities, such as genetic and clinical data, and developing multi-modal deep learning models can enhance prediction accuracy. However, most existing multi-modal deep learning methods either overlook patient similarities that benefit prognosis prediction or fail to effectively capture diverse information due to measuring patient similarities from a single perspective. To address these issues, a novel framework called multi-modal multi-view graph convolutional networks (MMGCN) is proposed for cancer prognosis prediction. METHODS: Initially, we utilize the similarity network fusion (SNF) algorithm to merge patient similarity networks (PSNs), individually constructed using gene expression, copy number alteration, and clinical data, into a fused PSN for integrating multi-modal information. To capture diverse perspectives of patient similarities, we treat the fused PSN as a multi-view graph by considering each single-edge-type subgraph as a view graph, and propose multi-view graph convolutional networks (GCNs) with a view-level attention mechanism. Moreover, an edge homophily prediction module is designed to alleviate the adverse effects of heterophilic edges on the representation power of GCNs. Finally, comprehensive representations of patient nodes are obtained to predict cancer prognosis. RESULTS: Experimental results demonstrate that MMGCN outperforms state-of-the-art baselines on four public datasets, including METABRIC, TCGA-BRCA, TCGA-LGG, and TCGA-LUSC, with the area under the receiver operating characteristic curve achieving 0.827 ± 0.005, 0.805 ± 0.014, 0.925 ± 0.007, and 0.746 ± 0.013, respectively. 
CONCLUSIONS: Our study reveals the effectiveness of the proposed MMGCN, which deeply explores patient similarities related to different modalities from a broad perspective, in enhancing the performance of multi-modal cancer prognosis prediction. The source code is publicly available at https://github.com/ping-y/MMGCN.
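The view-level attention described above, which weights the single-edge-type view graphs of the fused PSN before combining patient-node representations, can be sketched roughly as follows. The scoring rule and the fixed query vector are assumptions made for illustration; in MMGCN the mechanism is learned end-to-end, and none of these names come from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def view_attention_fusion(view_embeddings, q):
    """Hedged sketch of a view-level attention mechanism (not the authors' code).

    view_embeddings: list of (n_nodes, d) arrays, one per single-edge-type
    view graph (e.g. per-view GCN outputs). q: a (d,) query vector that
    scores each view; here supplied by the caller rather than learned.
    Returns the attention weights and the fused patient-node representation.
    """
    # Score each view by the mean alignment of its node embeddings with q.
    scores = np.array([H.mean(axis=0) @ q for H in view_embeddings])
    weights = softmax(scores)
    # Fuse views as an attention-weighted sum of their node embeddings.
    fused = sum(w * H for w, H in zip(weights, view_embeddings))
    return weights, fused
```

The fused matrix keeps one row per patient node, so downstream layers can consume it exactly like a single-view embedding.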

4.
Sci Rep ; 14(1): 21136, 2024 09 10.
Article in English | MEDLINE | ID: mdl-39256414

ABSTRACT

The identification and classification of various phenotypic features of Auricularia cornea fruiting bodies are crucial for quality grading and breeding efforts. The phenotypic features of Auricularia cornea fruiting bodies encompass size, number, shape, color, pigmentation, and damage. These phenotypic features are distributed across various views of the fruiting bodies, making the task of achieving both rapid and accurate identification and classification challenging. This paper proposes a novel multi-view multi-label fast network that integrates two different views of the Auricularia cornea fruiting body, enabling rapid and precise identification and classification of six phenotypic features simultaneously. Initially, a multi-view feature extraction model based on partial convolution was constructed. This model incorporates channel attention mechanisms to achieve rapid phenotypic feature extraction of the Auricularia cornea fruiting body. Subsequently, an efficient multi-task classifier was designed, based on class-specific residual attention, to ensure accurate classification of phenotypic features. Finally, task weights were dynamically adjusted based on heteroscedastic uncertainty, reducing the training complexity of the multi-task classification. The proposed network achieved a classification accuracy of 94.66% and an inference speed of 11.9 ms on an image dataset of dried Auricularia cornea fruiting bodies with three views and six labels. The results demonstrate that the proposed network can efficiently and accurately identify and classify all phenotypic features of Auricularia cornea.


Subject(s)
Phenotype , Basidiomycota/classification , Basidiomycota/physiology , Fruiting Bodies, Fungal , Image Processing, Computer-Assisted/methods , Algorithms , Neural Networks, Computer
5.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39344710

ABSTRACT

Epidemiologic and genetic studies in many complex diseases suggest subgroup disparities (e.g. by sex, race) in disease course and patient outcomes. We consider this from the standpoint of integrative analysis where we combine information from different views (e.g. genomics, proteomics, clinical data). Existing integrative analysis methods ignore the heterogeneity in subgroups, and stacking the views and accounting for subgroup heterogeneity does not model the association among the views. We propose Heterogeneity in Integration and Prediction (HIP), a statistical approach for joint association and prediction that leverages the strengths in each view to identify molecular signatures that are shared by and specific to a subgroup. We apply HIP to proteomics and gene expression data pertaining to chronic obstructive pulmonary disease (COPD) to identify proteins and genes shared by, and unique to, males and females, contributing to the variation in COPD, measured by airway wall thickness. Our COPD findings have identified proteins, genes, and pathways that are common across and specific to males and females, some implicated in COPD, while others could lead to new insights into sex differences in COPD mechanisms. HIP accounts for subgroup heterogeneity in multi-view data, ranks variables based on importance, is applicable to univariate or multivariate continuous outcomes, and incorporates covariate adjustment. With the efficient algorithms implemented using PyTorch, this method has many potential scientific applications and could enhance multiomics research in health disparities. HIP is available at https://github.com/lasandrall/HIP, a video tutorial at https://youtu.be/O6E2OLmeMDo and a Shiny Application at https://multi-viewlearn.shinyapps.io/HIP_ShinyApp/ for users with limited programming experience.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Humans , Pulmonary Disease, Chronic Obstructive/genetics , Male , Female , Proteomics/methods , Algorithms , Genomics/methods , Computational Biology/methods
6.
Neural Netw ; 180: 106648, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39197306

ABSTRACT

In multi-view learning, graph-based methods such as the Graph Convolutional Network (GCN) are extensively researched due to their effective graph-processing capabilities. However, most GCN-based methods require complex preliminary operations such as sparsification, which may bring additional computation costs and training difficulties. Additionally, as the number of stacked layers increases, the over-smoothing problem arises, resulting in ineffective utilization of GCN capabilities. In this paper, we propose an attention-based stackable graph convolutional network that captures consistency across views and combines an attention mechanism with the powerful aggregation capability of GCNs to effectively mitigate over-smoothing. Specifically, we introduce node self-attention to establish dynamic connections between nodes and generate view-specific representations. To maintain cross-view consistency, a data-driven approach is devised to assign attention weights to views, forming a common representation. Finally, based on residual connectivity, we apply an attention mechanism to the original projection features to generate layer-specific complementarity, which compensates for the information loss during graph convolution. Comprehensive experimental results demonstrate that the proposed method outperforms other state-of-the-art methods on multi-view semi-supervised tasks.
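The residual idea above, re-injecting (an attended slice of) the original features after each graph convolution so that deep stacks do not wash out node identity, can be illustrated with the sketch below. This is an assumed simplification of the mechanism, not the paper's formulation: the weights are random stand-ins for learned parameters, and the per-node gate stands in for the layer-specific attention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stackable_gcn(A, X, n_layers=4, seed=0):
    """Illustrative sketch (assumed, not the authors' exact model): after
    every GCN layer, a gated residual of the original features X is added
    back, counteracting the over-smoothing that deep stacking causes."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A_tilde = A + np.eye(n)                          # add self-loops
    deg = A_tilde.sum(axis=1)
    A_hat = A_tilde / np.sqrt(np.outer(deg, deg))    # symmetric normalisation
    H = X
    for _ in range(n_layers):
        W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
        H = np.maximum(A_hat @ H @ W, 0.0)           # plain GCN layer: ReLU(A_hat H W)
        gate = sigmoid(H @ rng.normal(size=d))       # per-node attention gate (stand-in)
        H = H + gate[:, None] * X                    # residual re-injection of X
    return H
```

Without the final residual line, repeated multiplication by `A_hat` drives all rows toward a common subspace; the re-injection keeps node-specific information alive at every depth.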

7.
Neural Netw ; 179: 106562, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39142173

ABSTRACT

Multi-view learning is an emerging field of multi-modal fusion, which involves representing a single instance using multiple heterogeneous features to improve compatibility prediction. However, existing graph-based multi-view learning approaches rest on homogeneity assumptions and pairwise relationships, which may not adequately capture the complex interactions among real-world instances. In this paper, we design a compressed hypergraph neural network from the perspective of multi-view heterogeneous graph learning. This approach effectively captures rich multi-view heterogeneous semantic information, incorporating a hypergraph structure that simultaneously enables the exploration of higher-order correlations between samples in multi-view scenarios. Specifically, we introduce efficient hypergraph convolutional networks based on an explainable regularizer-centered optimization framework. Additionally, a low-rank approximation is adopted to reformulate the initial complex multi-view heterogeneous graph as hypergraphs. Extensive experiments against several advanced node classification methods and multi-view classification methods demonstrate the feasibility and effectiveness of the proposed method.


Subject(s)
Neural Networks, Computer , Algorithms , Machine Learning , Semantics , Humans
8.
Am J Hum Genet ; 111(8): 1736-1749, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39053459

ABSTRACT

Mendelian randomization (MR) provides valuable assessments of the causal effect of exposure on outcome, yet the application of conventional MR methods for mapping risk genes encounters new challenges. One of the issues is the limited availability of expression quantitative trait loci (eQTLs) as instrumental variables (IVs), hampering the estimation of sparse causal effects. Additionally, the often context- or tissue-specific eQTL effects challenge the MR assumption of consistent IV effects across eQTL and GWAS data. To address these challenges, we propose a multi-context multivariable integrative MR framework, mintMR, for mapping expression and molecular traits as joint exposures. It models the effects of molecular exposures across multiple tissues in each gene region, while simultaneously estimating across multiple gene regions. It uses eQTLs with consistent effects across more than one tissue type as IVs, improving IV consistency. A major innovation of mintMR involves employing multi-view learning methods to collectively model latent indicators of disease relevance across multiple tissues, molecular traits, and gene regions. The multi-view learning captures the major patterns of disease relevance and uses these patterns to update the estimated tissue relevance probabilities. The proposed mintMR iterates between performing a multi-tissue MR for each gene region and jointly learning the disease-relevant tissue probabilities across gene regions, improving the estimation of sparse effects across genes. We apply mintMR to evaluate the causal effects of gene expression and DNA methylation for 35 complex traits using multi-tissue QTLs as IVs. The proposed mintMR controls genome-wide inflation and offers insights into disease mechanisms.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Mendelian Randomization Analysis , Quantitative Trait Loci , Humans , Mendelian Randomization Analysis/methods , Genome-Wide Association Study/methods , Organ Specificity/genetics , Models, Genetic , Polymorphism, Single Nucleotide
9.
Neural Netw ; 178: 106438, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38906055

ABSTRACT

This paper proposes a novel approach to semantic representation learning from multi-view datasets, distinct from most existing methodologies, which typically handle each view individually and maintain a shared semantic link across the multi-view data only via a unified optimization process. Notably, even recent advancements such as Co-GCN continue to treat each view as an independent graph, subsequently aggregating the respective GCN representations to form output representations, which ignores the complex semantic interactions among heterogeneous data. To address this issue, we design a unified framework that connects multi-view data with heterogeneous graphs. Specifically, our study envisions multi-view data as a heterogeneous graph composed of shared isomorphic nodes and multi-type edges, wherein the same nodes are shared across different views, but each specific view possesses its own unique edge type. This perspective motivates us to utilize the heterogeneous graph convolutional network (HGCN) to extract semantic representations from multi-view data for semi-supervised classification tasks. To the best of our knowledge, this is an early attempt to transform multi-view data into a heterogeneous graph within the realm of multi-view semi-supervised learning. In our approach, the original input of the HGCN is composed of concatenated multi-view matrices, and its convolutional operator (the graph Laplacian matrix) is adaptively learned from multi-type edges in a data-driven fashion. After rigorous experimentation on eight public datasets, our proposed method, referred to as HGCN-MVSC, demonstrated encouraging superiority over several state-of-the-art competitors on semi-supervised classification tasks.
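The "shared nodes, per-view edge types" construction above, with a convolution operator combined from multi-type edges, can be pictured with the sketch below. It assumes kNN edges per view and caller-supplied view weights; in HGCN-MVSC the combination is learned adaptively from data, and all names here are hypothetical.

```python
import numpy as np

def knn_adjacency(X, k=3):
    """Symmetric kNN adjacency over one view's features (one edge type)."""
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # no self-edges at this stage
    A = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(len(X)), k)
    A[rows, idx.ravel()] = 1.0
    return np.maximum(A, A.T)                    # symmetrise

def fused_conv_operator(views, weights, k=3):
    """Hypothetical sketch: one edge type per view over a shared node set,
    combined into a single renormalised convolution operator
    D^{-1/2} (A + I) D^{-1/2} with view weights supplied by the caller
    (learned in a data-driven fashion in the actual method)."""
    n = len(views[0])
    A = sum(w * knn_adjacency(X, k) for w, X in zip(weights, views))
    A_tilde = A + np.eye(n)                      # self-loops
    deg = A_tilde.sum(axis=1)
    return A_tilde / np.sqrt(np.outer(deg, deg))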


Subject(s)
Neural Networks, Computer , Semantics , Supervised Machine Learning , Humans , Algorithms
10.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38920342

ABSTRACT

Effective molecular representation learning is very important for Artificial Intelligence-driven Drug Design because it affects the accuracy and efficiency of molecular property prediction and other molecular modeling tasks. However, previous molecular representation learning studies often suffer from limitations, such as over-reliance on a single molecular representation, failure to fully capture both local and global information in molecular structure, and ineffective integration of multiscale features from different molecular representations. These limitations restrict the complete and accurate representation of molecular structure and properties, ultimately impacting the accuracy of predicting molecular properties. To this end, we propose a novel multi-view molecular representation learning method called MvMRL, which can incorporate feature information from multiple molecular representations and capture both local and global information from different views well, thus improving molecular property prediction. Specifically, MvMRL consists of four parts: a multiscale CNN-SE Simplified Molecular Input Line Entry System (SMILES) learning component and a multiscale Graph Neural Network encoder to extract local and global feature information from the SMILES view and the molecular graph view, respectively; a Multi-Layer Perceptron network to capture complex non-linear relationship features from the molecular fingerprint view; and a dual cross-attention component to deeply fuse feature information across the multiple views for predicting molecular properties. We evaluate the performance of MvMRL on 11 benchmark datasets, and experimental results show that MvMRL outperforms state-of-the-art methods, indicating its rationality and effectiveness in molecular property prediction. The source code of MvMRL is released at https://github.com/jedison-github/MvMRL.


Subject(s)
Neural Networks, Computer , Algorithms , Machine Learning , Models, Molecular , Drug Design , Software , Molecular Structure , Artificial Intelligence
11.
BMC Bioinformatics ; 25(1): 188, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38745112

ABSTRACT

BACKGROUND: Microbiome dysbiosis has recently been associated with different diseases and disorders. In this context, machine learning (ML) approaches can be useful either to identify new patterns or to learn predictive models. However, data to be fed to ML methods can be subject to different sampling, sequencing and preprocessing techniques. Each different choice in the pipeline can lead to a different view (i.e., feature set) of the same individuals, which classical (single-view) ML approaches may fail to simultaneously consider. Moreover, some views may be incomplete, i.e., some individuals may be missing in some views, possibly due to the absence of some measurements or to the fact that some features are not available/applicable for all the individuals. Multi-view learning methods can represent a possible solution to consider multiple feature sets for the same individuals, but most existing multi-view learning methods are limited to binary classification tasks or cannot work with incomplete views. RESULTS: We propose irBoost.SH, an extension of the multi-view boosting algorithm rBoost.SH, based on multi-armed bandits. irBoost.SH solves multi-class classification tasks and can analyze incomplete views. At each iteration, it identifies one winning view using adversarial multi-armed bandits and uses its predictions to update a shared instance weight distribution in a learning process based on boosting. In our experiments, performed on 5 multi-view microbiome datasets, the model learned by irBoost.SH always outperforms the best model learned from a single view, its closest competitor rBoost.SH, and the model learned by a multi-view approach based on feature concatenation, reaching an improvement of 11.8% in F1-score for the prediction of Autism Spectrum disorder and of 114% for the prediction of Colorectal Cancer. CONCLUSIONS: The proposed method irBoost.SH exhibited outstanding performance in our experiments, also compared to competitor approaches.
The obtained results confirm that irBoost.SH can fruitfully be adopted for the analysis of microbiome data, due to its capability to simultaneously exploit multiple feature sets obtained through different sequencing and preprocessing pipelines.
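The adversarial multi-armed bandit component that identifies a winning view at each iteration can be illustrated with a minimal EXP3-style sketch. The interface (reward callables per view) is an assumption for illustration only; irBoost.SH interleaves this selection with a boosting update over a shared instance weight distribution, which is omitted here.

```python
import numpy as np

def exp3_view_selection(view_rewards, n_rounds=300, gamma=0.2, seed=0):
    """Minimal EXP3 sketch of adversarial-bandit view selection (assumed
    interface, not the irBoost.SH implementation).

    view_rewards[v]() returns a reward in [0, 1] for choosing view v,
    e.g. the boosting-round quality of that view's weak learner.
    Returns how often each view ("arm") was selected.
    """
    rng = np.random.default_rng(seed)
    K = len(view_rewards)
    w = np.ones(K)                                # exponential weights per arm
    picks = np.zeros(K, dtype=int)
    for _ in range(n_rounds):
        p = (1 - gamma) * w / w.sum() + gamma / K # mix in uniform exploration
        v = rng.choice(K, p=p)
        r = view_rewards[v]()
        w[v] *= np.exp(gamma * (r / p[v]) / K)    # importance-weighted update
        picks[v] += 1
    return picks
```

Over many rounds the selection concentrates on the view that yields the higher rewards, while the `gamma` exploration term keeps every view occasionally sampled.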


Subject(s)
Algorithms , Machine Learning , Microbiota , Humans
12.
Med Image Anal ; 96: 103192, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38810516

ABSTRACT

Methods to detect malignant lesions from screening mammograms are usually trained with fully annotated datasets, where images are labelled with the localisation and classification of cancerous lesions. However, real-world screening mammogram datasets commonly have a subset that is fully annotated and another subset that is weakly annotated with just the global classification (i.e., without lesion localisation). Given the large size of such datasets, researchers usually face a dilemma with the weakly annotated subset: to not use it or to fully annotate it. The first option will reduce detection accuracy because it does not use the whole dataset, and the second option is too expensive given that the annotation needs to be done by expert radiologists. In this paper, we propose a middle-ground solution for the dilemma, which is to formulate the training as a weakly- and semi-supervised learning problem that we refer to as malignant breast lesion detection with incomplete annotations. To address this problem, our new method comprises two stages, namely: (1) pre-training a multi-view mammogram classifier with weak supervision from the whole dataset, and (2) extending the trained classifier to become a multi-view detector that is trained with semi-supervised student-teacher learning, where the training set contains fully and weakly-annotated mammograms. We provide extensive detection results on two real-world screening mammogram datasets containing incomplete annotations and show that our proposed approach achieves state-of-the-art results in the detection of malignant breast lesions with incomplete annotations.


Subject(s)
Breast Neoplasms , Mammography , Radiographic Image Interpretation, Computer-Assisted , Humans , Breast Neoplasms/diagnostic imaging , Mammography/methods , Female , Radiographic Image Interpretation, Computer-Assisted/methods , Algorithms , Supervised Machine Learning
13.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605642

ABSTRACT

MicroRNAs (miRNAs) synergize with various biomolecules in human cells, resulting in diverse functions in regulating a wide range of biological processes. Predicting potential disease-associated miRNAs as valuable biomarkers contributes to the treatment of human diseases. However, few previous methods take a holistic perspective; most concentrate on isolated miRNA and disease objects, thereby ignoring that human cells host multiple relationships. In this work, we first constructed a multi-view graph based on the relationships between miRNAs and various biomolecules, and then utilized a graph attention neural network to learn the graph topology features of miRNAs and diseases for each view. Next, we added a further attention mechanism and developed a multi-scale feature fusion module, aiming to determine the optimal fusion results for the multi-view topology features of miRNAs and diseases. In addition, the prior attribute knowledge of miRNAs and diseases was simultaneously incorporated to achieve better prediction results and solve the cold-start problem. Finally, the learned miRNA and disease representations were concatenated and fed into a multi-layer perceptron for end-to-end training and predicting potential miRNA-disease associations. To assess the efficacy of our model (called MUSCLE), we performed 5- and 10-fold cross-validation (CV), obtaining average areas under the ROC curve of 0.966 ± 0.0102 and 0.973 ± 0.0135, respectively, outperforming most current state-of-the-art models. We then examined the impact of crucial parameters on prediction performance and performed ablation experiments on the feature combination and model architecture. Furthermore, case studies on colon cancer, lung cancer and breast cancer fully demonstrate the good inductive capability of MUSCLE. Our data and code are freely available at a public GitHub repository: https://github.com/zht-code/MUSCLE.git.


Subject(s)
Colonic Neoplasms , Lung Neoplasms , MicroRNAs , Humans , Muscles , Learning , MicroRNAs/genetics , Algorithms , Computational Biology
14.
PeerJ Comput Sci ; 10: e1874, 2024.
Article in English | MEDLINE | ID: mdl-38481705

ABSTRACT

Epilepsy is a chronic, non-communicable disease caused by paroxysmal abnormal synchronized electrical activity of brain neurons, and is one of the most common neurological diseases worldwide. Electroencephalography (EEG) is currently a crucial tool for epilepsy diagnosis. With the development of artificial intelligence, multi-view learning-based EEG analysis has become an important method for automatic epilepsy recognition because EEG contains diverse types of features, such as time-frequency, frequency-domain and time-domain features. However, current multi-view learning still faces some challenges, for example that the difference between samples of the same class from different views can be greater than the difference between samples of different classes from the same view. In view of this, in this study we propose a shared hidden space-driven multi-view learning algorithm. The algorithm uses kernel density estimation to construct a shared hidden space and combines the shared hidden space with the original space to obtain an expanded space for multi-view learning. By constructing the expanded space and utilizing the information of both the shared hidden space and the original space for learning, the relevant information of samples within and across views can be fully exploited. Experimental results on an epilepsy dataset provided by the University of Bonn show that the proposed algorithm has promising performance, with an average classification accuracy of 0.9787, at least a 4% improvement over single-view methods.
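The expanded-space construction above can be pictured as follows, under the simplifying assumption that the shared hidden space is a per-view kernel density feature concatenated onto each view's original features; the actual algorithm's estimator and combination rule may differ, and all names here are illustrative.

```python
import numpy as np

def kde_expanded_space(views, bandwidth=1.0):
    """Hedged sketch of a KDE-driven shared hidden space (assumed form).

    Each sample is described by its Gaussian-KDE density under every view;
    stacking these densities gives a shared hidden space, which is then
    concatenated with the original features to form the expanded space.
    """
    hidden = []
    for X in views:
        d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
        K = np.exp(-d2 / (2 * bandwidth ** 2))        # Gaussian kernel matrix
        hidden.append(K.mean(axis=1, keepdims=True))  # per-sample KDE density
    H = np.hstack(hidden)                             # shared hidden space
    return [np.hstack([X, H]) for X in views]         # expanded space per view
```

Every expanded view gains the same hidden columns, so a classifier trained per view still sees information shared across all views.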

15.
Int J Biol Macromol ; 264(Pt 2): 130638, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38460652

ABSTRACT

The rational modification of siRNA molecules is crucial for ensuring their drug-like properties. Machine learning-based prediction of chemically modified siRNA (cm-siRNA) efficiency can significantly optimize the design process of siRNA chemical modifications, saving time and cost in siRNA drug development. However, existing in-silico methods suffer from limitations such as small datasets, inadequate data representation capabilities, and lack of interpretability. Therefore, in this study, we developed the Cm-siRPred algorithm based on a multi-view learning strategy. The algorithm employs a multi-view strategy to represent the double-strand sequences, chemical modifications, and physicochemical properties of cm-siRNA. It incorporates a cross-attention model to globally correlate the different representation vectors and a two-layer CNN module to learn local correlation features. The algorithm demonstrates exceptional performance in cross-validation experiments, on an independent dataset, and in case studies of approved siRNA drugs, showcasing its robustness and generalization ability. In addition, we developed a user-friendly webserver that enables efficient prediction of cm-siRNA efficiency and assists in the design of siRNA drug chemical modifications. In summary, Cm-siRPred is a practical tool that offers valuable technical support for siRNA chemical modification and drug efficiency research, while effectively assisting in the development of novel small nucleic acid drugs. Cm-siRPred is freely available at https://cellknowledge.com.cn/sirnapredictor/.


Subject(s)
Algorithms , Machine Learning , RNA, Small Interfering/genetics , RNA, Small Interfering/chemistry
16.
Comput Med Imaging Graph ; 114: 102371, 2024 06.
Article in English | MEDLINE | ID: mdl-38513397

ABSTRACT

Knee OsteoArthritis (OA) is a prevalent chronic condition, affecting a significant proportion of the global population. Detecting knee OA is crucial as the degeneration of the knee joint is irreversible. In this paper, we introduce a semi-supervised multi-view framework and a 3D CNN model for detecting knee OA using 3D Magnetic Resonance Imaging (MRI) scans. We introduce a semi-supervised learning approach combining labeled and unlabeled data to improve the performance and generalizability of the proposed model. Experimental results show the efficacy of our proposed approach in detecting knee OA from 3D MRI scans using a large cohort of 4297 subjects. An ablation study was conducted to investigate the contributions of various components of the proposed model, providing insights into the optimal design of the model. Our results indicate the potential of the proposed approach to improve the accuracy and efficiency of OA diagnosis. The proposed framework reported an AUC of 93.20% for the detection of knee OA.


Subject(s)
Osteoarthritis, Knee , Humans , Osteoarthritis, Knee/diagnostic imaging , Knee Joint/diagnostic imaging , Magnetic Resonance Imaging/methods
17.
Comput Biol Med ; 171: 108087, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38364658

ABSTRACT

Thyroid nodule classification and segmentation in ultrasound images are crucial for computer-aided diagnosis; however, they face limitations owing to insufficient labeled data. In this study, we proposed a multi-view contrastive self-supervised method to improve thyroid nodule classification and segmentation performance with limited manual labels. Our method aligns the transverse and longitudinal views of the same nodule, thereby enabling the model to focus more on the nodule area. We designed an adaptive loss function that eliminates the limitations of the paired data. Additionally, we adopted a two-stage pre-training to exploit the pre-training on ImageNet and thyroid ultrasound images. Extensive experiments were conducted on a large-scale dataset collected from multiple centers. The results showed that the proposed method significantly improves nodule classification and segmentation performance with limited manual labels and outperforms state-of-the-art self-supervised methods. The two-stage pre-training also significantly exceeded ImageNet pre-training.


Subject(s)
Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Diagnosis, Computer-Assisted , Ultrasonography , Supervised Machine Learning , Image Processing, Computer-Assisted
18.
Sensors (Basel) ; 24(2)2024 Jan 18.
Article in English | MEDLINE | ID: mdl-38257712

ABSTRACT

Federated learning (FL) is a privacy-preserving collective machine learning paradigm. Vertical federated learning (VFL) deals with the case where participants share the same sample ID space but have different feature spaces, while label information is owned by one participant. Early studies of VFL supported two participants and focused on binary-class logistic regression problems, while recent studies have put more attention on specific aspects such as communication efficiency and data security. In this paper, we propose the multi-participant multi-class vertical federated learning (MMVFL) framework for multi-class VFL problems involving multiple parties. By extending the idea of multi-view learning (MVL), MMVFL enables label sharing from its owner to other VFL participants in a privacy-preserving manner. To demonstrate the effectiveness of MMVFL, a feature selection scheme is incorporated into MMVFL to compare its performance against supervised feature selection and MVL-based approaches. The proposed framework is capable of quantifying feature importance and measuring participant contributions. It is also simple and easy to combine with other communication and security techniques. The experiment results on feature selection for classification tasks on real-world datasets show that MMVFL can effectively share label information among multiple VFL participants and match the multi-class classification performance of existing approaches.

19.
Comput Biol Med ; 169: 107898, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38176210

ABSTRACT

Accurate segmentation of the thyroid gland in ultrasound images is an essential first step in distinguishing between benign and malignant nodules, thus facilitating early diagnosis. Most existing deep learning-based methods for segmenting thyroid nodules learn from only one or two views, which limits their ability to segment nodules at different scales in complex ultrasound scanning environments. To address this limitation, this study proposes a multi-view learning model, abbreviated as MLMSeg. First, a deep convolutional neural network is introduced to encode the features of the local view. Second, a multi-channel transformer module is designed to capture long-range dependencies of the global view between different nodules. Third, the structural view captures semantic relationships between features of different layers; for example, low-level and high-level features carry hidden relationships in the feature space. To this end, a cross-layer graph convolutional module is proposed to adaptively learn the correlations between high-level and low-level features by constructing graphs across different layers. In addition, for view fusion, a channel-aware graph attention block is devised to fuse the features from the aforementioned views for accurate segmentation of thyroid nodules. To demonstrate the effectiveness of the proposed method, extensive comparative experiments were conducted against 14 baseline methods. MLMSeg achieved higher Dice coefficients (92.10% and 83.84%) and Intersection over Union scores (86.60% and 73.52%) on two different thyroid datasets. The exceptional segmentation capability of MLMSeg can greatly assist in localizing thyroid nodules and enable more precise measurement of their transverse and longitudinal diameters, which is of significant clinical relevance for the diagnosis of thyroid nodules.


Subject(s)
Thyroid Nodule , Humans , Ultrasonography , Neural Networks, Computer , Semantics , Image Processing, Computer-Assisted
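For reference, the Dice coefficient and Intersection over Union reported above are two views of the same mask overlap and are related by Dice = 2·IoU / (1 + IoU). A minimal sketch of both metrics computed from flat binary masks:

```python
def dice_and_iou(pred, target):
    # pred, target: binary masks flattened to equal-length 0/1 lists.
    inter = sum(p * t for p, t in zip(pred, target))
    p_sum, t_sum = sum(pred), sum(target)
    dice = 2 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    union = p_sum + t_sum - inter
    iou = inter / union if union else 1.0     # both masks empty => perfect
    return dice, iou
```

Because Dice weights the intersection twice, it is always at least as large as IoU, which is consistent with the paired scores quoted in the abstract (92.10% Dice vs. 86.60% IoU, and 83.84% vs. 73.52%).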
20.
Comput Biol Med ; 170: 107941, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38217976

ABSTRACT

Immunotherapy is an emerging treatment method that activates the human immune system, relying on the patient's own immune function to kill cancer cells and tumor tissues. It has the advantages of wide applicability and minimal side effects. Effective identification of tumor T cell antigens (TTCAs) will help researchers understand their functions and mechanisms and support research on anti-tumor vaccine development. Because identifying TTCAs through biological experiments is costly and time-consuming, it is necessary to develop a robust bioinformatics tool. Various machine learning models have been proposed for identifying TTCAs, but there is still room to improve their performance. To establish a TTCA predictor with better prediction performance, we propose a model called iTTCA-MVL. We extracted three sets of features from the views of physicochemical information and sequence statistics: the distribution descriptor of composition, transition, and distribution (CTDD), TF-IDF, and LSA topic features. We then used least squares support vector machines (LSSVMs) as submodels, with the Hilbert-Schmidt independence criterion (HSIC) as a constraint, to establish an independent and complementary multi-view learning model. The prediction accuracy of iTTCA-MVL on the independent test set is 0.873, and the Matthews correlation coefficient is 0.747, significantly better than those of existing methods. Therefore, iTTCA-MVL is an excellent prediction tool that researchers can use to accurately identify TTCAs.


Subject(s)
Computational Biology , Machine Learning , Humans , Computational Biology/methods , T-Lymphocytes
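The HSIC constraint mentioned above encourages the view-specific submodels to stay complementary rather than redundant. As an illustration (not the paper's implementation), here is the standard empirical HSIC estimator with linear kernels, trace(KHLH) / (n-1)^2, where H = I - (1/n)11^T double-centres the kernel matrices:

```python
def hsic(x, y):
    """Empirical HSIC between two feature views with linear kernels.

    x, y: lists of n feature vectors (one row per sample). A value near
    zero indicates the two views look statistically independent.
    """
    n = len(x)
    K = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
         for i in range(n)]                    # Gram matrix of view x
    L = [[sum(a * b for a, b in zip(y[i], y[j])) for j in range(n)]
         for i in range(n)]                    # Gram matrix of view y
    def centre(M):
        # Double centring: M -> H M H via row/column/total means.
        row = [sum(r) / n for r in M]
        col = [sum(M[i][j] for i in range(n)) / n for j in range(n)]
        tot = sum(row) / n
        return [[M[i][j] - row[i] - col[j] + tot for j in range(n)]
                for i in range(n)]
    Kc, Lc = centre(K), centre(L)
    trace = sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2
```

In the paper's setting, HSIC is used as a constraint between LSSVM submodels built on different feature views (CTDD, TF-IDF, LSA), steering each view toward information the others do not already carry.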