Search | VHL Regional Portal

1.

Life regression based patch slimming for vision transformers.

Chen, Jiawei; Chen, Lin; Yang, Jiang; Shi, Tianqi; Cheng, Lechao; Feng, Zunlei; Song, Mingli.

Neural Netw ; 176: 106340, 2024 Apr 25.

Article in English | MEDLINE | ID: mdl-38713967

ABSTRACT

Vision transformers have achieved remarkable success in computer vision tasks by using multi-head self-attention modules to capture long-range dependencies within images. However, the high inference computation cost poses a new challenge. Several methods have been proposed to address this problem, mainly by slimming patches. In the inference stage, these methods classify patches into two classes, one to keep and the other to discard in multiple layers. This approach results in additional computation at every layer where patches are discarded, which hinders inference acceleration. In this study, we tackle the patch slimming problem from a different perspective by proposing a life regression module that determines the lifespan of each image patch in one go. During inference, the patch is discarded once the current layer index exceeds its life. Our proposed method avoids additional computation and parameters in multiple layers to enhance inference speed while maintaining competitive performance. Additionally, our approach requires fewer training epochs than other patch slimming methods.
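The discard rule stated in the abstract above (a patch is dropped once the current layer index exceeds its life) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the 1-based layer indexing, and the example lifespans are assumptions made for illustration, and the life regression module that would predict the lifespans is not modelled here.

```python
from typing import List, Sequence


def surviving_patches(lifespans: Sequence[int], layer_index: int) -> List[int]:
    """Return the indices of patches still alive at `layer_index`.

    Assumption: layers are numbered from 1, and a patch with life L is
    processed in layers 1..L and discarded once layer_index > L.
    """
    return [i for i, life in enumerate(lifespans) if layer_index <= life]


def keep_schedule(lifespans: Sequence[int], num_layers: int) -> List[List[int]]:
    """Precompute which patches each layer processes.

    The lifespans are decided once, so no per-layer keep/discard
    classifier is evaluated while building the schedule.
    """
    return [surviving_patches(lifespans, layer) for layer in range(1, num_layers + 1)]


# Hypothetical lifespans for four patches in a three-layer model.
lifespans = [1, 3, 2, 3]
per_layer = keep_schedule(lifespans, num_layers=3)
for layer, kept in enumerate(per_layer, start=1):
    print(f"layer {layer}: patches {kept}")
```

Because the schedule depends only on the lifespans predicted up front, the set of patches each layer processes is known before the forward pass reaches that layer.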

2.

Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning.

Liu, Shunyu; Song, Jie; Zhou, Yihe; Yu, Na; Chen, Kaixuan; Feng, Zunlei; Song, Mingli.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 May 13.

Article in English | MEDLINE | ID: mdl-38739512

ABSTRACT

Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT.

3.

Fast and effective molecular property prediction with transferability map.

Yao, Shaolun; Song, Jie; Jia, Lingxiang; Cheng, Lechao; Zhong, Zipeng; Song, Mingli; Feng, Zunlei.

Commun Chem ; 7(1): 85, 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38632308

ABSTRACT

Effective transfer learning for molecular property prediction has shown considerable strength in addressing insufficient labeled molecules. Many existing methods either disregard the quantitative relationship between source and target properties, risking negative transfer, or require intensive training on target tasks. To quantify transferability concerning task-relatedness, we propose Principal Gradient-based Measurement (PGM) for transferring molecular property prediction ability. First, we design an optimization-free scheme to calculate a principal gradient for approximating the direction of model optimization on a molecular property prediction dataset. We have analyzed the close connection between the principal gradient and model optimization through mathematical proof. PGM measures the transferability as the distance between the principal gradient obtained from the source dataset and that derived from the target dataset. Then, we perform PGM on various molecular property prediction datasets to build a quantitative transferability map for source dataset selection. Finally, we evaluate PGM on multiple combinations of transfer learning tasks across 12 benchmark molecular property prediction datasets and demonstrate that it can serve as fast and effective guidance to improve the performance of a target task. This work contributes to more efficient discovery of drugs, materials, and catalysts by offering a task-relatedness quantification prior to transfer learning and understanding the relationship between chemical properties.

4.

Deep learning-based accurate diagnosis and quantitative evaluation of microvascular invasion in hepatocellular carcinoma on whole-slide histopathology images.

Zhang, Xiuming; Yu, Xiaotian; Liang, Wenjie; Zhang, Zhongliang; Zhang, Shengxuming; Xu, Linjie; Zhang, Han; Feng, Zunlei; Song, Mingli; Zhang, Jing; Feng, Shi.

Cancer Med ; 13(5): e7104, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38488408

ABSTRACT

BACKGROUND: Microvascular invasion (MVI) is an independent prognostic factor that is associated with early recurrence and poor survival after resection of hepatocellular carcinoma (HCC). However, the traditional pathology approach is relatively subjective, time-consuming, and heterogeneous in the diagnosis of MVI. The aim of this study was to develop a deep-learning model that could significantly improve the efficiency and accuracy of MVI diagnosis. MATERIALS AND METHODS: We collected H&E-stained slides from 753 patients with HCC at the First Affiliated Hospital of Zhejiang University. An external validation set with 358 patients was selected from The Cancer Genome Atlas database. The deep-learning model was trained by simulating the method used by pathologists to diagnose MVI. Model performance was evaluated with accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic curve. RESULTS: We successfully developed an MVI artificial intelligence diagnostic model (MVI-AIDM) which achieved an accuracy of 94.25% in the independent external validation set. The MVI positive detection rate of MVI-AIDM was significantly higher than that of pathologists. Visualization results demonstrated the recognition of micro MVIs that were difficult to differentiate by traditional pathology. Additionally, the model provided automatic quantification of the number of cancer cells and spatial information regarding MVI. CONCLUSIONS: We developed a deep learning diagnostic model, which performed well and improved the efficiency and accuracy of MVI diagnosis. The model provided spatial information of MVI that was essential to accurately predict HCC recurrence after surgery.

Subject(s)

Carcinoma, Hepatocellular , Deep Learning , Liver Neoplasms , Humans , Carcinoma, Hepatocellular/pathology , Liver Neoplasms/pathology , Artificial Intelligence , Retrospective Studies , Neoplasm Invasiveness

5.

HairStyle Editing via Parametric Controllable Strokes.

Song, Xinhui; Liu, Chen; Zheng, Youyi; Feng, Zunlei; Li, Lincheng; Zhou, Kun; Yu, Xin.

IEEE Trans Vis Comput Graph ; PP, 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-37022457

ABSTRACT

In this work, we propose a stroke-based hairstyle editing network, dubbed HairstyleNet, allowing users to conveniently change the hairstyles of an image in an interactive fashion. Different from previous works, we simplify the hairstyle editing process where users can manipulate local or entire hairstyles by adjusting the parameterized hair regions. Our HairstyleNet consists of two stages: a stroke parameterization stage and a stroke-to-hair generation stage. In the stroke parameterization stage, we firstly introduce parametric strokes to approximate the hair wisps, where the stroke shape is controlled by a quadratic Bézier curve and a thickness parameter. Since rendering strokes with thickness to an image is not differentiable, we opt to leverage a neural renderer to construct the mapping from stroke parameters to a stroke image. Thus, the stroke parameters can be directly estimated from hair regions in a differentiable way, enabling us to flexibly edit the hairstyles of input images. In the stroke-to-hair generation stage, we design a hairstyle refinement network that first encodes coarsely composed images of hair strokes, face, and background into latent representations and then generates high-fidelity face images with desirable new hairstyles from the latent codes. Extensive experiments demonstrate that our HairstyleNet achieves state-of-the-art performance and allows flexible hairstyle manipulation.
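The abstract above describes stroke shapes controlled by a quadratic Bézier curve. A minimal sketch of evaluating such a curve follows, using the standard formula B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2. The function names and control points are illustrative assumptions; the thickness parameter and the neural renderer from the paper are not modelled.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1].

    p0 and p2 are the endpoints; p1 is the control point that bends the curve.
    """
    u = 1.0 - t
    x = u * u * p0[0] + 2.0 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2.0 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def sample_stroke(p0: Point, p1: Point, p2: Point, n: int = 5) -> List[Point]:
    """Sample n >= 2 evenly spaced parameter values along the stroke centerline."""
    return [quadratic_bezier(p0, p1, p2, i / (n - 1)) for i in range(n)]


# Hypothetical control points for one stroke.
centerline = sample_stroke((0.0, 0.0), (1.0, 2.0), (2.0, 0.0), n=5)
print(centerline)
```

Three control points plus a thickness value give a compact parameterization of a stroke, which is what makes per-stroke editing by adjusting a few numbers possible.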

6.

Conservative-Progressive Collaborative Learning for Semi-Supervised Semantic Segmentation.

Fan, Siqi; Zhu, Fenghua; Feng, Zunlei; Lv, Yisheng; Song, Mingli; Wang, Fei-Yue.

IEEE Trans Image Process ; 32: 6183-6194, 2023.

Article in English | MEDLINE | ID: mdl-37022902

ABSTRACT

Pseudo supervision is regarded as the core idea in semi-supervised learning for semantic segmentation, and there is always a tradeoff between utilizing only the high-quality pseudo labels and leveraging all the pseudo labels. To address this, we propose a novel learning approach, called Conservative-Progressive Collaborative Learning (CPCL), in which two predictive networks are trained in parallel, and the pseudo supervision is implemented based on both the agreement and disagreement of the two predictions. One network seeks common ground via intersection supervision and is supervised by the high-quality labels to ensure a more reliable supervision, while the other network reserves differences via union supervision and is supervised by all the pseudo labels to keep exploring with curiosity. Thus, the collaboration of conservative evolution and progressive exploration can be achieved. To reduce the influence of suspicious pseudo labels, the loss is dynamically re-weighted according to the prediction confidence. Extensive experiments demonstrate that CPCL achieves state-of-the-art performance for semi-supervised semantic segmentation.
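One possible reading of the intersection and union supervision described above is sketched below on flat lists of per-pixel predictions. This is an illustration under stated assumptions, not the CPCL implementation: the abstract does not say how disagreements are resolved for union supervision, so taking the more confident prediction is an assumption, as are the function names, the `IGNORE` marker, and the example values.

```python
from typing import List, Sequence

IGNORE = -1  # hypothetical marker for pixels excluded from the loss


def intersection_labels(pred_a: Sequence[int], pred_b: Sequence[int]) -> List[int]:
    """Conservative pseudo labels: keep a pixel only where both networks agree."""
    return [a if a == b else IGNORE for a, b in zip(pred_a, pred_b)]


def union_labels(
    pred_a: Sequence[int],
    conf_a: Sequence[float],
    pred_b: Sequence[int],
    conf_b: Sequence[float],
) -> List[int]:
    """Progressive pseudo labels: label every pixel.

    Assumption: on disagreement, take the class from the more confident network.
    """
    return [
        a if (a == b or ca >= cb) else b
        for a, ca, b, cb in zip(pred_a, conf_a, pred_b, conf_b)
    ]


# Hypothetical predictions and confidences for three pixels.
pred_a, conf_a = [0, 1, 2], [0.9, 0.4, 0.8]
pred_b, conf_b = [0, 2, 2], [0.8, 0.7, 0.6]

print(intersection_labels(pred_a, pred_b))
print(union_labels(pred_a, conf_a, pred_b, conf_b))
```

The intersection labels cover fewer pixels but only those both networks agree on, while the union labels cover every pixel at the cost of including less reliable ones, which mirrors the tradeoff the abstract describes.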

7.

Knowledge Amalgamation for Object Detection With Transformers.

Zhang, Haofei; Mao, Feng; Xue, Mengqi; Fang, Gongfan; Feng, Zunlei; Song, Jie; Song, Mingli.

IEEE Trans Image Process ; 32: 2093-2106, 2023.

Article in English | MEDLINE | ID: mdl-37023145

ABSTRACT

Knowledge amalgamation (KA) is a novel deep model reusing task aiming to transfer knowledge from several well-trained teachers to a multi-talented and compact student. Currently, most of these approaches are tailored for convolutional neural networks (CNNs). However, Transformers, with a completely different architecture, are starting to challenge the domination of CNNs in many computer vision tasks. Nevertheless, directly applying the previous KA methods to Transformers leads to severe performance degradation. In this work, we explore a more effective KA scheme for Transformer-based object detection models. Specifically, considering the architecture characteristics of Transformers, we propose to decompose KA into two aspects: sequence-level amalgamation (SA) and task-level amalgamation (TA). In particular, a hint is generated within the sequence-level amalgamation by concatenating teacher sequences instead of redundantly aggregating them to a fixed-size one as in previous KA approaches. In addition, the student efficiently learns heterogeneous detection tasks through soft targets in the task-level amalgamation. Extensive experiments on PASCAL VOC and COCO show that the sequence-level amalgamation significantly boosts the performance of students, while the previous methods impair the students. Moreover, the Transformer-based students excel in learning amalgamated knowledge, as they have mastered heterogeneous detection tasks rapidly and achieved superior or at least comparable performance to those of the teachers in their specializations.

8.

Transition Propagation Graph Neural Networks for Temporal Networks.

Zheng, Tongya; Feng, Zunlei; Zhang, Tianli; Hao, Yunzhi; Song, Mingli; Wang, Xingen; Wang, Xinyu; Zhao, Ji; Chen, Chun.

IEEE Trans Neural Netw Learn Syst ; PP, 2022 Nov 18.

Article in English | MEDLINE | ID: mdl-36399591

ABSTRACT

Researchers of temporal networks (e.g., social networks and transaction networks) have been interested in mining dynamic patterns of nodes from their diverse interactions. Inspired by recent powerful graph mining methods like skip-gram models and graph neural networks (GNNs), existing approaches focus on generating temporal node embeddings sequentially with nodes' sequential interactions. However, the sequential modeling of previous approaches cannot handle the transition structure between nodes' neighbors with limited memorization capacity. In detail, an effective method for the transition structures is required to both model nodes' personalized patterns adaptively and capture node dynamics accordingly. In this article, we propose a method, namely transition propagation graph neural networks (TIP-GNN), to tackle the challenges of encoding nodes' transition structures. The proposed TIP-GNN focuses on the bilevel graph structure in temporal networks: besides the explicit interaction graph, a node's sequential interactions can also be constructed as a transition graph. Based on the bilevel graph, TIP-GNN further encodes transition structures by multistep transition propagation and distills information from neighborhoods by a bilevel graph convolution. Experimental results over various temporal networks reveal the efficiency of our TIP-GNN, with accuracy improvements of up to 7.2% on temporal link prediction. Extensive ablation studies further verify the effectiveness and limitations of the transition propagation module. Our code is available at https://github.com/doujiang-zheng/TIP-GNN.

9.

Root-aligned SMILES: a tight representation for chemical reaction prediction.

Zhong, Zipeng; Song, Jie; Feng, Zunlei; Liu, Tiantao; Jia, Lingxiang; Yao, Shaolun; Wu, Min; Hou, Tingjun; Song, Mingli.

Chem Sci ; 13(31): 9023-9034, 2022 Aug 10.

Article in English | MEDLINE | ID: mdl-36091202

ABSTRACT

Chemical reaction prediction, involving forward synthesis and retrosynthesis prediction, is a fundamental problem in organic synthesis. A popular computational paradigm formulates synthesis prediction as a sequence-to-sequence translation problem, where the typical SMILES is adopted for molecule representations. However, the general-purpose SMILES neglects the characteristics of chemical reactions, where the molecular graph topology is largely unaltered from reactants to products, resulting in the suboptimal performance of SMILES if straightforwardly applied. In this article, we propose the root-aligned SMILES (R-SMILES), which specifies a tightly aligned one-to-one mapping between the product and the reactant SMILES for more efficient synthesis prediction. Due to the strict one-to-one mapping and reduced edit distance, the computational model is largely relieved from learning the complex syntax and dedicated to learning the chemical knowledge for reactions. We compare the proposed R-SMILES with various state-of-the-art baselines and show that it significantly outperforms them all, demonstrating the superiority of the proposed method.

10.

Deep learning based diagnosis for cysts and tumors of jaw with massive healthy samples.

Yu, Dan; Hu, Jiacong; Feng, Zunlei; Song, Mingli; Zhu, Huiyong.

Sci Rep ; 12(1): 1855, 2022 Feb 03.

Article in English | MEDLINE | ID: mdl-35115624

ABSTRACT

We aimed to develop an explainable and reliable deep learning method to diagnose cysts and tumors of the jaw using massive panoramic radiographs of healthy people, since collecting and labeling massive lesion samples is time-consuming, and existing deep learning-based methods lack explainability. Based on the collected 872 lesion samples and 10,000 healthy samples, a two-branch network was proposed for classifying the cysts and tumors of the jaw. The two-branch network is first pretrained on massive panoramic radiographs of healthy people, then trained to classify the sample categories and segment the lesion area. In total, 200 healthy samples and 87 lesion samples were included in the testing stage. The average accuracy, precision, sensitivity, specificity, and F1 score of classification are 88.72%, 65.81%, 66.56%, 92.66%, and 66.14%, respectively. When only distinguishing lesion samples from healthy samples, the average accuracy, precision, sensitivity, specificity, and F1 score of classification reach 90.66%, 85.23%, 84.27%, 93.50%, and 84.74%, respectively. The proposed method showed encouraging performance in the diagnosis of cysts and tumors of the jaw. The classified categories and segmented lesion areas serve as the diagnostic basis for further diagnosis, which provides a reliable tool for diagnosing jaw tumors and cysts.

Subject(s)

Deep Learning , Jaw Cysts/diagnostic imaging , Jaw Neoplasms/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted , Radiography, Panoramic , Case-Control Studies , Humans , Predictive Value of Tests , Reproducibility of Results

11.

Development of a Deep Learning Model to Assist With Diagnosis of Hepatocellular Carcinoma.

Feng, Shi; Yu, Xiaotian; Liang, Wenjie; Li, Xuejie; Zhong, Weixiang; Hu, Wanwan; Zhang, Han; Feng, Zunlei; Song, Mingli; Zhang, Jing; Zhang, Xiuming.

Front Oncol ; 11: 762733, 2021.

Article in English | MEDLINE | ID: mdl-34926264

ABSTRACT

BACKGROUND: An accurate pathological diagnosis of hepatocellular carcinoma (HCC), one of the malignant tumors with the highest mortality rate, is time-consuming and heavily reliant on the experience of a pathologist. In this report, we proposed a deep learning model that required minimal noise reduction or manual annotation by an experienced pathologist for HCC diagnosis and classification. METHODS: We collected whole-slide images of hematoxylin and eosin-stained pathological slides from 592 HCC patients at the First Affiliated Hospital, College of Medicine, Zhejiang University between 2015 and 2020. We proposed a noise-specific deep learning model. The model was trained initially with 137 cases cropped into multiple-scaled datasets. Patch screening and dynamic label smoothing strategies were adopted to handle the histopathological liver images with noisy annotation from the perspective of input and output. The model was then tested in an independent cohort of 455 cases with comparable tumor types and differentiations. RESULTS: Exhaustive experiments demonstrated that our two-step method achieved 87.81% pixel-level accuracy and 98.77% slide-level accuracy in the test dataset. Furthermore, the generalization performance of our model was also verified using The Cancer Genome Atlas dataset, which contains 157 HCC pathological slides, and achieved an accuracy of 87.90%. CONCLUSIONS: The noise-specific histopathological classification model of HCC based on deep learning is effective for the dataset with noisy annotation, and it significantly improved the pixel-level accuracy of the regular convolutional neural network (CNN) model. Moreover, the model also has an advantage in detecting well-differentiated HCC and microvascular invasion.

12.

Neural Style Transfer: A Review.

Jing, Yongcheng; Yang, Yezhou; Feng, Zunlei; Ye, Jingwen; Yu, Yizhou; Song, Mingli.

IEEE Trans Vis Comput Graph ; 26(11): 3365-3385, 2020 Nov.

Article in English | MEDLINE | ID: mdl-31180860

ABSTRACT

The seminal work of Gatys et al. demonstrated the power of Convolutional Neural Networks (CNNs) in creating artistic imagery by separating and recombining image content and style. This process of using CNNs to render a content image in different styles is referred to as Neural Style Transfer (NST). Since then, NST has become a trending topic both in academic literature and industrial applications. It is receiving increasing attention and a variety of approaches are proposed to either improve or extend the original NST algorithm. In this paper, we aim to provide a comprehensive overview of the current progress towards NST. We first propose a taxonomy of current algorithms in the field of NST. Then, we present several evaluation methods and compare different NST algorithms both qualitatively and quantitatively. The review concludes with a discussion of various applications of NST and open problems for future research. A list of papers discussed in this review, corresponding codes, pre-trained models and more comparison results are publicly available at: https://osf.io/f8tu4/.
