|

1.

SUCCESSIVE SUBSPACE LEARNING FOR CARDIAC DISEASE CLASSIFICATION WITH TWO-PHASE DEFORMATION FIELDS FROM CINE MRI.

Liu, Xiaofeng; Xing, Fangxu; Gaggin, Hanna K; Kuo, C-C Jay; El Fakhri, Georges; Woo, Jonghye.

Proc IEEE Int Symp Biomed Imaging ; 20232023 Apr.

Article En | MEDLINE | ID: mdl-38031559

Cardiac cine magnetic resonance imaging (MRI) has been used to characterize cardiovascular diseases (CVD), often providing a noninvasive phenotyping tool. While recently flourished deep learning based approaches using cine MRI yield accurate characterization results, the performance is often degraded by small training samples. In addition, many deep learning models are deemed a "black box," for which models remain largely elusive in how models yield a prediction and how reliable they are. To alleviate this, this work proposes a lightweight successive subspace learning (SSL) framework for CVD classification, based on an interpretable feedforward design, in conjunction with a cardiac atlas. Specifically, our hierarchical SSL model is based on (i) neighborhood voxel expansion, (ii) unsupervised subspace approximation, (iii) supervised regression, and (iv) multi-level feature integration. In addition, using two-phase 3D deformation fields, including end-diastolic and end-systolic phases, derived between the atlas and individual subjects as input offers objective means of assessing CVD, even with small training samples. We evaluate our framework on the ACDC2017 database, comprising one healthy group and four disease groups. Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140× fewer parameters, which supports its potential value in clinical use.

2.

Perceptually Weighted Rate Distortion Optimization for Video-Based Point Cloud Compression.

Zhang, Yun; Ding, Keqin; Li, Na; Wang, Hanli; Huang, Xiaoxia; Kuo, C-C Jay.

IEEE Trans Image Process ; 32: 5933-5947, 2023.

Article En | MEDLINE | ID: mdl-37903048

Dynamic point cloud is a volumetric visual data representing realistic 3D scenes for virtual reality and augmented reality applications. However, its large data volume has been the bottleneck of data processing, transmission, and storage, which requires effective compression. In this paper, we propose a Perceptually Weighted Rate-Distortion Optimization (PWRDO) scheme for Video-based Point Cloud Compression (V-PCC), which aims to minimize the perceptual distortion of reconstructed point cloud at the given bit rate. Firstly, we propose a general framework of perceptually optimized V-PCC to exploit visual redundancies in point clouds. Secondly, a multi-scale Projection based Point Cloud quality Metric (PPCM) is proposed to measure the perceptual quality of 3D point cloud. The PPCM model comprises 3D-to-2D patch projection, multi-scale structural distortion measurement, and fusion model. Approximations and simplifications of the proposed PPCM are also presented for both V-PCC integration and low complexity. Thirdly, based on the simplified PPCM model, we propose a PWRDO scheme with Lagrange multiplier adaptation, which is incorporated into the V-PCC to enhance the coding efficiency. Experimental results show that the proposed PPCM models can be used as standalone quality metrics, and they are able to achieve higher consistency with the human subjective scores than the state-of-the-art objective visual quality metrics. Also, compared with the latest V-PCC reference model, the proposed PWRDO-based V-PCC scheme achieves an average bit rate reduction of 13.52%, 8.16%, 10.56% and 9.54%, respectively, in terms of four objective visual quality metrics for point clouds. It is significantly superior to the state-of-the-art coding algorithms. The computational complexity of the proposed PWRDO increases by 1.71% and 0.05% on average to the V-PCC encoder and decoder, respectively, which is negligible. The source codes of the PPCM and PWRDO schemes are available at https://github.com/VVCodec/PPCM-PWRDO.

3.

Label Efficient Regularization and Propagation for Graph Node Classification.

Xie, Tian; Kannan, Rajgopal; Kuo, C-C Jay.

IEEE Trans Pattern Anal Mach Intell ; 45(12): 14856-14871, 2023 Dec.

Article En | MEDLINE | ID: mdl-37647182

An enhanced label propagation (LP) method called GraphHop was proposed recently. It outperforms graph convolutional networks (GCNs) in the semi-supervised node classification task on various networks. Although the performance of GraphHop was explained intuitively with joint node attribute and label signal smoothening, its rigorous mathematical treatment is lacking. In this paper, we propose a label efficient regularization and propagation (LERP) framework for graph node classification, and present an alternate optimization procedure for its solution. Furthermore, we show that GraphHop only offers an approximate solution to this framework and has two drawbacks. First, it includes all nodes in the classifier training without taking the reliability of pseudo-labeled nodes into account in the label update step. Second, it provides a rough approximation to the optimum of a subproblem in the label aggregation step. Based on the LERP framework, we propose a new method, named the LERP method, to solve these two shortcomings. LERP determines reliable pseudo-labels adaptively during the alternate optimization and provides a better approximation to the optimum with computational efficiency. Theoretical convergence of LERP is guaranteed. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of LERP. That is, LERP outperforms all benchmarking methods, including GraphHop, consistently on five common test datasets, two large-scale networks, and an object recognition task at extremely low label rates (i.e., 1, 2, 4, 8, 16, and 20 labeled samples per class).

4.

PRISMA AI reporting guidelines for systematic reviews and meta-analyses on AI in healthcare.

Cacciamani, Giovanni E; Chu, Timothy N; Sanford, Daniel I; Abreu, Andre; Duddalwar, Vinay; Oberai, Assad; Kuo, C-C Jay; Liu, Xiaoxuan; Denniston, Alastair K; Vasey, Baptiste; McCulloch, Peter; Wolff, Robert F; Mallett, Sue; Mongan, John; Kahn, Charles E; Sounderajah, Viknesh; Darzi, Ara; Dahm, Philipp; Moons, Karel G M; Topol, Eric; Collins, Gary S; Moher, David; Gill, Inderbir S; Hung, Andrew J.

Nat Med ; 29(1): 14-15, 2023 01.

Article En | MEDLINE | ID: mdl-36646804

Checklist , Publishing , Delivery of Health Care , Artificial Intelligence

5.

GraphHop: An Enhanced Label Propagation Method for Node Classification.

Xie, Tian; Wang, Bin; Kuo, C-C Jay.

IEEE Trans Neural Netw Learn Syst ; 34(11): 9287-9301, 2023 Nov.

Article En | MEDLINE | ID: mdl-35302944

A scalable semisupervised node classification method on graph-structured data, called GraphHop, is proposed in this work. The graph contains all nodes' attributes and link connections but labels of only a subset of nodes. Graph convolutional networks (GCNs) have provided superior performance in node label classification over the traditional label propagation (LP) methods for this problem. Nevertheless, current GCN algorithms suffer from a considerable amount of labels for training because of high model complexity or cannot be easily generalized to large-scale graphs due to the expensive cost of loading the entire graph and node embeddings. Besides, nonlinearity makes the optimization process a mystery. To this end, an enhanced LP method, called GraphHop, is proposed to tackle these problems. GraphHop can be viewed as a smoothening LP algorithm, in which each propagation alternates between two steps: label aggregation and label update. In the label aggregation step, multihop neighbor embeddings are aggregated to the center node. In the label update step, new embeddings are learned and predicted for each node based on aggregated results from the previous step. The two-step iteration improves the graph signal smoothening capacity. Furthermore, to encode attributes, links, and labels on graphs effectively under one framework, we adopt a two-stage training process, i.e., the initialization stage and the iteration stage. Thus, the smooth attribute information extracted from the initialization stage is consistently imposed in the propagation process in the iteration stage. Experimental results show that GraphHop outperforms state-of-the-art graph learning methods on a wide range of tasks in graphs of various sizes (e.g., multilabel and multiclass classification on citation networks, social graphs, and commodity consumption graphs).

6.

L-BGNN: Layerwise Trained Bipartite Graph Neural Networks.

Xie, Tian; He, Chaoyang; Ren, Xiang; Shahabi, Cyrus; Kuo, C-C Jay.

IEEE Trans Neural Netw Learn Syst ; 34(12): 10711-10723, 2023 Dec.

Article En | MEDLINE | ID: mdl-35544501

Learning low-dimensional representations of bipartite graphs enables e-commerce applications, such as recommendation, classification, and link prediction. A layerwise-trained bipartite graph neural network (L-BGNN) embedding method, which is unsupervised, efficient, and scalable, is proposed in this work. To aggregate the information across and within two partitions of a bipartite graph, a customized interdomain message passing (IDMP) operation and an intradomain alignment (IDA) operation are adopted by the proposed L-BGNN method. Furthermore, we develop a layerwise training algorithm for L-BGNN to capture the multihop relationship of large bipartite networks and improve training efficiency. We conduct extensive experiments on several datasets and downstream tasks of various scales to demonstrate the effectiveness and efficiency of the L-BGNN method as compared with state-of-the-art methods. Our codes are publicly available at https://github.com/TianXieUSC/L-BGNN.

7.

Unsupervised Domain Adaptation for Segmentation with Black-box Source Model.

Liu, Xiaofeng; Yoo, Chaehwa; Xing, Fangxu; Kuo, C-C Jay; El Fakhri, Georges; Kang, Je-Won; Woo, Jonghye.

Proc SPIE Int Soc Opt Eng ; 120322022.

Article En | MEDLINE | ID: mdl-35983176

Unsupervised domain adaptation (UDA) has been widely used to transfer knowledge from a labeled source domain to an unlabeled target domain to counter the difficulty of labeling in a new domain. The training of conventional solutions usually relies on the existence of both source and target domain data. However, privacy of the large-scale and well-labeled data in the source domain and trained model parameters can become the major concern of cross center/domain collaborations. In this work, to address this, we propose a practical solution to UDA for segmentation with a black-box segmentation model trained in the source domain only, rather than original source data or a white-box source model. Specifically, we resort to a knowledge distillation scheme with exponential mixup decay (EMD) to gradually learn target-specific representations. In addition, unsupervised entropy minimization is further applied to regularization of the target domain confidence. We evaluated our framework on the BraTS 2018 database, achieving performance on par with white-box source model adaptation approaches.

8.

Subtype-Aware Dynamic Unsupervised Domain Adaptation.

Liu, Xiaofeng; Xing, Fangxu; You, Jane; Lu, Jun; Kuo, C-C Jay; Fakhri, Georges El; Woo, Jonghye.

IEEE Trans Neural Netw Learn Syst ; PP2022 Jul 27.

Article En | MEDLINE | ID: mdl-35895653

Unsupervised domain adaptation (UDA) has been successfully applied to transfer knowledge from a labeled source domain to target domains without their labels. Recently introduced transferable prototypical networks (TPNs) further address class-wise conditional alignment. In TPN, while the closeness of class centers between source and target domains is explicitly enforced in a latent space, the underlying fine-grained subtype structure and the cross-domain within-class compactness have not been fully investigated. To counter this, we propose a new approach to adaptively perform a fine-grained subtype-aware alignment to improve the performance in the target domain without the subtype label in both domains. The insight of our approach is that the unlabeled subtypes in a class have the local proximity within a subtype while exhibiting disparate characteristics because of different conditional and label shifts. Specifically, we propose to simultaneously enforce subtype-wise compactness and class-wise separation, by utilizing intermediate pseudo-labels. In addition, we systematically investigate various scenarios with and without prior knowledge of subtype numbers and propose to exploit the underlying subtype structure. Furthermore, a dynamic queue framework is developed to evolve the subtype cluster centroids steadily using an alternative processing scheme. Experimental results, carried out with multiview congenital heart disease data and VisDA and DomainNet, show the effectiveness and validity of our subtype-aware UDA, compared with state-of-the-art UDA methods.

9.

Geometrical Interpretation and Design of Multilayer Perceptrons.

Lin, Ruiyuan; Zhou, Zhiruo; You, Suya; Rao, Raghuveer; Kuo, C-C Jay.

IEEE Trans Neural Netw Learn Syst ; PP2022 Jul 21.

Article En | MEDLINE | ID: mdl-35862331

The multilayer perceptron (MLP) neural network is interpreted from the geometrical viewpoint in this work, that is, an MLP partition an input feature space into multiple nonoverlapping subspaces using a set of hyperplanes, where the great majority of samples in a subspace belongs to one object class. Based on this high-level idea, we propose a three-layer feedforward MLP (FF-MLP) architecture for its implementation. In the first layer, the input feature space is split into multiple subspaces by a set of partitioning hyperplanes and rectified linear unit (ReLU) activation, which is implemented by the classical two-class linear discriminant analysis (LDA). In the second layer, each neuron activates one of the subspaces formed by the partitioning hyperplanes with specially designed weights. In the third layer, all subspaces of the same class are connected to an output node that represents the object class. The proposed design determines all MLP parameters in a feedforward one-pass fashion analytically without backpropagation. Experiments are conducted to compare the performance of the traditional backpropagation-based MLP (BP-MLP) and the new FF-MLP. It is observed that the FF-MLP outperforms the BP-MLP in terms of design time, training time, and classification performance in several benchmarking datasets. Our source code is available at https://colab.research.google.com/drive/1Gz0L8A-nT4ijrUchrhEXXsnaacrFdenn?usp = sharing.

10.

Unsupervised Black-Box Model Domain Adaptation for Brain Tumor Segmentation.

Liu, Xiaofeng; Yoo, Chaehwa; Xing, Fangxu; Kuo, C-C Jay; El Fakhri, Georges; Kang, Je-Won; Woo, Jonghye.

Front Neurosci ; 16: 837646, 2022.

Article En | MEDLINE | ID: mdl-35720708

Unsupervised domain adaptation (UDA) is an emerging technique that enables the transfer of domain knowledge learned from a labeled source domain to unlabeled target domains, providing a way of coping with the difficulty of labeling in new domains. The majority of prior work has relied on both source and target domain data for adaptation. However, because of privacy concerns about potential leaks in sensitive information contained in patient data, it is often challenging to share the data and labels in the source domain and trained model parameters in cross-center collaborations. To address this issue, we propose a practical framework for UDA with a black-box segmentation model trained in the source domain only, without relying on source data or a white-box source model in which the network parameters are accessible. In particular, we propose a knowledge distillation scheme to gradually learn target-specific representations. Additionally, we regularize the confidence of the labels in the target domain via unsupervised entropy minimization, leading to performance gain over UDA without entropy minimization. We extensively validated our framework on a few datasets and deep learning backbones, demonstrating the potential for our framework to be applied in challenging yet realistic clinical settings.

11.

R-PointHop: A Green, Accurate, and Unsupervised Point Cloud Registration Method.

Kadam, Pranav; Zhang, Min; Liu, Shan; Kuo, C-C Jay.

IEEE Trans Image Process ; 31: 2710-2725, 2022.

Article En | MEDLINE | ID: mdl-35324441

Inspired by the recent PointHop classification method, an unsupervised 3D point cloud registration method, called R-PointHop, is proposed in this work. R-PointHop first determines a local reference frame (LRF) for every point using its nearest neighbors and finds local attributes. Next, R-PointHop obtains local-to-global hierarchical features by point downsampling, neighborhood expansion, attribute construction and dimensionality reduction steps. Thus, point correspondences are built in hierarchical feature space using the nearest neighbor rule. Afterwards, a subset of salient points with good correspondence is selected to estimate the 3D transformation. The use of the LRF allows for invariance of the hierarchical features of points with respect to rotation and translation, thus making R-PointHop more robust at building point correspondence, even when the rotation angles are large. Experiments are conducted on the 3DMatch, ModelNet40, and Stanford Bunny datasets, which demonstrate the effectiveness of R-PointHop for 3D point cloud registration. R-PointHop's model size and training time are an order of magnitude smaller than those of deep learning methods, and its registration errors are smaller, making it a green and accurate solution. Our codes are available on GitHub (https://github.com/pranavkdm/R-PointHop).

12.

Brain MR Atlas Construction Using Symmetric Deep Neural Inpainting.

Xing, Fangxu; Liu, Xiaofeng; Kuo, C-C Jay; Fakhri, Georges El; Woo, Jonghye.

IEEE J Biomed Health Inform ; 26(7): 3185-3196, 2022 07.

Article En | MEDLINE | ID: mdl-35139030

Modeling statistical properties of anatomical structures using magnetic resonance imaging is essential for revealing common information of a target population and unique properties of specific subjects. In brain imaging, a statistical brain atlas is often constructed using a number of healthy subjects. When tumors are present, however, it is difficult to either provide a common space for various subjects or align their imaging data due to the unpredictable distribution of lesions. Here we propose a deep learning-based image inpainting method to replace the tumor regions with normal tissue intensities using only a patient population. Our framework has three major innovations: 1) incompletely distributed datasets with random tumor locations can be used for training; 2) irregularly-shaped tumor regions are properly learned, identified, and corrected; and 3) a symmetry constraint between the two brain hemispheres is applied to regularize inpainted regions. Henceforth, regular atlas construction and image registration methods can be applied using inpainted data to obtain tissue deformation, thereby achieving group-specific statistical atlases and patient-to-atlas registration. Our framework was tested using the public database from the Multimodal Brain Tumor Segmentation challenge. Results showed increased similarity scores as well as reduced reconstruction errors compared with three existing image inpainting methods. Patient-to-atlas registration also yielded better results with improved normalized cross-correlation and mutual information and a reduced amount of deformation over the tumor regions.

Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Brain/diagnostic imaging , Humans , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods

13.

BERTHop: An Effective Vision-and-Language Model for Chest X-ray Disease Diagnosis.

Monajatipoor, Masoud; Rouhsedaghat, Mozhdeh; Li, Liunian Harold; Kuo, C-C Jay; Chien, Aichi; Chang, Kai-Wei.

Med Image Comput Comput Assist Interv ; 13435: 725-734, 2022 Sep.

Article En | MEDLINE | ID: mdl-37093922

Vision-and-language (V&L) models take image and text as input and learn to capture the associations between them. These models can potentially deal with the tasks that involve understanding medical images along with their associated text. However, applying V&L models in the medical domain is challenging due to the expensiveness of data annotations and the requirements of domain knowledge. In this paper, we identify that the visual representation in general V&L models is not suitable for processing medical data. To overcome this limitation, we propose BERTHop, a transformer-based model based on PixelHop++ and VisualBERT for better capturing the associations between clinical notes and medical images. Experiments on the OpenI dataset, a commonly used thoracic disease diagnosis benchmark, show that BERTHop achieves an average Area Under the Curve (AUC) of 98.12% which is 1.62% higher than state-of-the-art while it is trained on a 9× smaller dataset.

14.

Mutual Information Regularized Feature-Level Frankenstein for Discriminative Recognition.

Liu, Xiaofeng; Yang, Chao; You, Jane; Kuo, C-C Jay; Kumar, B V K Vijaya.

IEEE Trans Pattern Anal Mach Intell ; 44(9): 5243-5260, 2022 09.

Article En | MEDLINE | ID: mdl-33945470

Deep learning recognition approaches can potentially perform better if we can extract a discriminative representation that controllably separates nuisance factors. In this paper, we propose a novel approach to explicitly enforce the extracted discriminative representation d, extracted latent variation l (e,g., background, unlabeled nuisance attributes), and semantic variation label vector s (e.g., labeled expressions/pose) to be independent and complementary to each other. We can cast this problem as an adversarial game in the latent space of an auto-encoder. Specifically, with the to-be-disentangled s, we propose to equip an end-to-end conditional adversarial network with the ability to decompose an input sample into d and l. However, we argue that maximizing the cross-entropy loss of semantic variation prediction from d is not sufficient to remove the impact of s from d, and that the uniform-target and entropy regularization are necessary. A collaborative mutual information regularization framework is further proposed to avoid unstable adversarial training. It is able to minimize the differentiable mutual information between the variables to enforce independence. The proposed discriminative representation inherits the desired tolerance property guided by prior knowledge of the task. Our proposed framework achieves top performance on diverse recognition tasks, including digits classification, large-scale face recognition on LFW and IJB-A datasets, and face recognition tolerant to changes in lighting, makeup, disguise, etc.

Facial Recognition , Pattern Recognition, Automated , Algorithms , Lighting

15.

ACT: Semi-supervised Domain-adaptive Medical Image Segmentation with Asymmetric Co-Training.

Liu, Xiaofeng; Xing, Fangxu; Shusharina, Nadya; Lim, Ruth; Kuo, C-C Jay; El Fakhri, Georges; Woo, Jonghye.

Med Image Comput Comput Assist Interv ; 13435: 66-76, 2022 Sep.

Article En | MEDLINE | ID: mdl-36780245

Unsupervised domain adaptation (UDA) has been vastly explored to alleviate domain shifts between source and target domains, by applying a well-performed model in an unlabeled target domain via supervision of a labeled source domain. Recent literature, however, has indicated that the performance is still far from satisfactory in the presence of significant domain shifts. Nonetheless, delineating a few target samples is usually manageable and particularly worthwhile, due to the substantial performance gain. Inspired by this, we aim to develop semi-supervised domain adaptation (SSDA) for medical image segmentation, which is largely underexplored. We, thus, propose to exploit both labeled source and target domain data, in addition to unlabeled target data in a unified manner. Specifically, we present a novel asymmetric co-training (ACT) framework to integrate these subsets and avoid the domination of the source domain data. Following a divide-and-conquer strategy, we explicitly decouple the label supervisions in SSDA into two asymmetric sub-tasks, including semi-supervised learning (SSL) and UDA, and leverage different knowledge from two segmentors to take into account the distinction between the source and target label supervisions. The knowledge learned in the two modules is then adaptively integrated with ACT, by iteratively teaching each other, based on the confidence-aware pseudo-label. In addition, pseudo label noise is well-controlled with an exponential MixUp decay scheme for smooth propagation. Experiments on cross-modality brain tumor MRI segmentation tasks using the BraTS18 database showed, even with limited labeled target samples, ACT yielded marked improvements over UDA and state-of-the-art SSDA methods and approached an "upper bound" of supervised joint training.

16.

Segmentation of Cardiac Structures via Successive Subspace Learning with Saab Transform from Cine MRI.

Liu, Xiaofeng; Xing, Fangxu; Gaggin, Hanna K; Wang, Weichung; Kuo, C-C Jay; El Fakhri, Georges; Woo, Jonghye.

Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 3535-3538, 2021 11.

Article En | MEDLINE | ID: mdl-34892002

Assessment of cardiovascular disease (CVD) with cine magnetic resonance imaging (MRI) has been used to non-invasively evaluate detailed cardiac structure and function. Accurate segmentation of cardiac structures from cine MRI is a crucial step for early diagnosis and prognosis of CVD, and has been greatly improved with convolutional neural networks (CNN). There, however, are a number of limitations identified in CNN models, such as limited interpretability and high complexity, thus limiting their use in clinical practice. In this work, to address the limitations, we propose a lightweight and interpretable machine learning model, successive subspace learning with the subspace approximation with adjusted bias (Saab) transform, for accurate and efficient segmentation from cine MRI. Specifically, our segmentation framework is comprised of the following steps: (1) sequential expansion of near-to-far neighborhood at different resolutions; (2) channel-wise subspace approximation using the Saab transform for unsupervised dimension reduction; (3) class-wise entropy guided feature selection for supervised dimension reduction; (4) concatenation of features and pixel-wise classification with gradient boost; and (5) conditional random field for post-processing. Experimental results on the ACDC 2017 segmentation database, showed that our framework performed better than state-of-the-art U-Net models with 200× fewer parameters in delineating the left ventricle, right ventricle, and myocardium, thus showing its potential to be used in clinical practice.Clinical relevance- Delineation of the left ventricular cavity, myocardium, and right ventricle from cardiac MR images is a common clinical task to establish diagnosis and prognosis of CVD.

Image Processing, Computer-Assisted , Magnetic Resonance Imaging, Cine , Heart/diagnostic imaging , Heart Ventricles/diagnostic imaging , Neural Networks, Computer

17.

Shape-Preserving Stereo Object Remapping via Object-Consistent Grid Warping.

Li, Bing; Lin, Chia-Wen; Zheng, Cheng; Liu, Shan; Ghanem, Bernard; Gao, Wen; Kuo, C-C Jay.

IEEE Trans Image Process ; 30: 5889-5904, 2021.

Article En | MEDLINE | ID: mdl-34156942

Viewing various stereo images under different viewing conditions has escalated the need for effective object-level remapping techniques. In this paper, we propose a new object spatial mapping scheme, which adjusts the depth and size of the selected object to match user preference and viewing conditions. Existing warping-based methods often distort the shape of important objects or cannot faithfully adjust the depth/size of the selected object due to improper warping such as local rotations. In this paper, by explicitly reducing the transformation freedom degree of warping, we propose an optimization model based on axis-aligned warping for object spatial remapping. The proposed axis-aligned warping based optimization model can simultaneously adjust the depths and sizes of selected objects to their target values without introducing severe shape distortions. Moreover, we propose object consistency constraints to ensure the size/shape of parts inside a selected object to be consistently adjusted. Such constraints improve the size/shape adjustment performance while remaining robust to some extent to incomplete object extraction. Experimental results demonstrate that the proposed method achieves high flexibility and effectiveness in adjusting the size and depth of objects compared with existing methods.

18.

VMAF Oriented Perceptual Coding Based on Piecewise Metric Coupling.

Luo, Zhengyi; Zhu, Chen; Huang, Yan; Xie, Rong; Song, Li; Kuo, C-C Jay.

IEEE Trans Image Process ; 30: 5109-5121, 2021.

Article En | MEDLINE | ID: mdl-33989154

It has been recognized that videos have to be encoded in a rate-distortion optimized manner for high coding performance. Therefore, operational coding methods have been developed for conventional distortion metrics such as Sum of Squared Error (SSE). Nowadays, with the rapid development of machine learning, the state-of-the-art learning based metric Video Multimethod Assessment Fusion (VMAF) has been proven to outperform conventional ones in terms of the correlation with human perception, and thus deserves integration into the coding framework. However, unlike conventional metrics, VMAF has no specific computational formulas and may be frequently updated by new training data, which invalidates the existing coding methods and makes it highly desired to develop a rate-distortion optimized method for VMAF. Moreover, VMAF is designed to operate at the frame level, which leads to further difficulties in its application to today's block based coding. In this paper, we propose a VMAF oriented perceptual coding method based on piecewise metric coupling. Firstly, we explore the correlation between VMAF and SSE in the neighborhood of a benchmark distortion. Then a rate-distortion optimization model is formulated based on the correlation, and an optimized block based coding method is presented for VMAF. Experimental results show that 3.61% and 2.67% bit saving on average can be achieved for VMAF under the low_delay_p and the random_access_main configurations of HEVC coding respectively.

19.

Saak Transform-Based Machine Learning for Light-Sheet Imaging of Cardiac Trabeculation.

Ding, Yichen; Gudapati, Varun; Lin, Ruiyuan; Fei, Yanan; Packard, Rene R Sevag; Song, Sibo; Chang, Chih-Chiang; Baek, Kyung In; Wang, Zhaoqiang; Roustaei, Mehrdad; Kuang, Dengfeng; Kuo, C-C Jay; Hsiai, Tzung K.

IEEE Trans Biomed Eng ; 68(1): 225-235, 2021 01.

Article En | MEDLINE | ID: mdl-32365015

OBJECTIVE: Recent advances in light-sheet fluorescence microscopy (LSFM) enable 3-dimensional (3-D) imaging of cardiac architecture and mechanics in toto. However, segmentation of the cardiac trabecular network to quantify cardiac injury remains a challenge. METHODS: We hereby employed "subspace approximation with augmented kernels (Saak) transform" for accurate and efficient quantification of the light-sheet image stacks following chemotherapy-treatment. We established a machine learning framework with augmented kernels based on the Karhunen-Loeve Transform (KLT) to preserve linearity and reversibility of rectification. RESULTS: The Saak transform-based machine learning enhances computational efficiency and obviates iterative optimization of cost function needed for neural networks, minimizing the number of training datasets for segmentation in our scenario. The integration of forward and inverse Saak transforms can also serve as a light-weight module to filter adversarial perturbations and reconstruct estimated images, salvaging robustness of existing classification methods. The accuracy and robustness of the Saak transform are evident following the tests of dice similarity coefficients and various adversary perturbation algorithms, respectively. The addition of edge detection further allows for quantifying the surface area to volume ratio (SVR) of the myocardium in response to chemotherapy-induced cardiac remodeling. CONCLUSION: The combination of Saak transform, random forest, and edge detection augments segmentation efficiency by 20-fold as compared to manual processing. SIGNIFICANCE: This new methodology establishes a robust framework for post light-sheet imaging processing, and creating a data-driven machine learning for automated quantification of cardiac ultra-structure.

Machine Learning , Neural Networks, Computer , Algorithms , Heart/diagnostic imaging , Image Processing, Computer-Assisted , Microscopy, Fluorescence

20.

Image Coding with Data-Driven Transforms: Methodology, Performance and Potential.

Zhang, Xinfeng; Yang, Chao; Li, Xiaoguang; Liu, Shan; Yang, Haitao; Katsavounidis, Ioannis; Lei, Shaw-Min; Kuo, C-C Jay.

IEEE Trans Image Process ; PP2020 Sep 24.

Article En | MEDLINE | ID: mdl-32970598

Image compression has always been an important topic in the last decades due to the explosive increase of images. The popular image compression formats are based on different transforms which convert images from the spatial domain into compact frequency domain to remove the spatial correlation. In this paper, we focus on the exploration of data-driven transform, Karhunen-Loéve transform (KLT), the kernels of which are derived from specific images via Principal Component Analysis (PCA), and design a high efficient KLT based image compression algorithm with variable transform sizes. To explore the optimal compression performance, the multiple transform sizes and categories are utilized and determined adaptively according to their rate-distortion (RD) costs. Moreover, comprehensive analyses on the transform coefficients are provided and a band-adaptive quantization scheme is proposed based on the coefficient RD performance. Extensive experiments are performed on several class-specific images as well as general images, and the proposed method achieves significant coding gain over the popular image compression standards including JPEG, JPEG 2000, and the state-of-the-art dictionary learning based methods.