ABSTRACT
The Cox proportional hazards model is a popular semi-parametric model for survival analysis. In this paper, we aim to develop a federated algorithm for the Cox proportional hazards model over vertically partitioned data (i.e., data from the same patient are stored at different institutions). We propose a novel algorithm, namely VERTICOX, to obtain the global model parameters in a distributed fashion based on the Alternating Direction Method of Multipliers (ADMM) framework. The proposed model computes intermediary statistics and exchanges them to calculate the global model without collecting individual patient-level data. We demonstrate that our algorithm estimates model parameters and statistics with accuracy equivalent to that of its centralized realization. The proposed algorithm converges linearly under the ADMM framework. Its computational complexity and communication costs are polynomially and linearly associated with the number of subjects, respectively. Experimental results show that VERTICOX can achieve accurate model parameter estimation to support federated survival analysis over vertically distributed data, saving bandwidth and avoiding the exchange of information about individual patients. The source code for VERTICOX is available at: https://github.com/daiwenrui/VERTICOX.
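The key idea behind the vertical setting — that per-patient intermediary statistics (linear predictors) suffice to reconstruct the global quantity without any institution revealing its raw covariate columns — can be sketched in a few lines. This is an illustrative NumPy toy under assumed names (`X_a`, `beta_a`, etc.), not the paper's ADMM procedure, which additionally handles the Cox partial likelihood and consensus updates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                            # patients, rows aligned across institutions
X_a = rng.normal(size=(n, 2))    # covariates held only by institution A
X_b = rng.normal(size=(n, 3))    # covariates held only by institution B
beta_a = np.array([0.5, -0.2])   # A's block of the model parameters
beta_b = np.array([0.1, 0.3, -0.4])

# Each institution shares only its aggregated per-patient linear predictor
# X_k @ beta_k -- never the raw covariate columns themselves.
z = X_a @ beta_a + X_b @ beta_b

# Centralized reference: everything concatenated in one place.
z_central = np.hstack([X_a, X_b]) @ np.concatenate([beta_a, beta_b])
```

The exchanged vectors `X_a @ beta_a` and `X_b @ beta_b` are exactly equal in sum to the centralized linear predictor, which is why the federated estimate can match the centralized one.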
ABSTRACT
MOTIVATION: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information, and unprotected disclosure of such information might put individuals' privacy at risk. It is therefore important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood for discovering rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variant analysis with a small sample size. RESULTS: We focus on algorithm design aimed at reducing the computational and storage costs of learning a homomorphic exact logistic regression model (i.e. evaluating P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of the data size. We evaluate the algorithm's performance using rare Kawasaki Disease datasets. AVAILABILITY AND IMPLEMENTATION: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ CONTACT: shw070@ucsd.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
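For intuition about what "evaluating P-values of coefficients" means in the plaintext domain, here is a standard Newton-Raphson logistic fit with Wald P-values. This is only the conventional asymptotic analogue: HEALER itself evaluates an *exact*, penalized likelihood under homomorphic encryption, which this sketch does not attempt; all names and the simulated data are illustrative:

```python
import math
import numpy as np

def logistic_pvalues(X, y, iters=25):
    """Plaintext Newton-Raphson logistic regression with Wald p-values.
    NOT exact logistic regression -- just the asymptotic reference."""
    X = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        H = X.T @ (X * W[:, None])              # Fisher information matrix
        beta += np.linalg.solve(H, X.T @ (y - p))
    se = np.sqrt(np.diag(np.linalg.inv(H)))     # standard errors
    z = beta / se
    # Two-sided normal tail via math.erf (stdlib, no SciPy dependency)
    pvals = np.array([2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(v) / math.sqrt(2.0))))
                      for v in z])
    return beta, pvals

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))
true_logit = 0.5 + 1.5 * X[:, 0]
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)
beta, pvals = logistic_pvalues(X, y)
```

Exact logistic regression replaces this asymptotic machinery with the conditional distribution of sufficient statistics, which is what makes it suitable for rare variants and small samples.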
Subjects
Algorithms, Genetic Privacy, Genetic Variation, Genome-Wide Association Study, Rare Diseases/genetics, Genome, Human, Humans, Mucocutaneous Lymph Node Syndrome/genetics
ABSTRACT
BACKGROUND: Endobronchial ultrasound (EBUS) elastography is a new imaging procedure for describing the elasticity of intrathoracic lesions and providing important additional diagnostic information. OBJECTIVES: The aim of this study was to assess the feasibility of qualitative and quantitative methods for evaluating the ability of EBUS elastography to differentiate between benign and malignant mediastinal and hilar lymph nodes (LNs) during EBUS-guided transbronchial needle aspiration (EBUS-TBNA). METHODS: Patients with enlarged intrathoracic LNs requiring EBUS-TBNA examination at a clinical center for thoracic medicine from January 2014 to April 2014 were prospectively enrolled. EBUS sonographic characteristics on B-mode, vascular patterns and elastography, EBUS-TBNA procedures, pathological findings, and microbiological results were recorded. Furthermore, elastographic patterns (qualitative method) and the mean gray value inside the region of interest (quantitative method) were analyzed. Both methods were compared with a definitive diagnosis of the involved LNs. RESULTS: Fifty-six patients including 68 LNs (33 benign and 35 malignant nodes) were prospectively enrolled into this study and retrospectively analyzed. Using the qualitative and quantitative methods, we were able to differentiate between benign and malignant LNs with high sensitivity, specificity, positive and negative predictive values, and accuracy (85.71, 81.82, 83.33, 84.38, and 83.82% vs. 91.43, 72.73, 78.05, 88.89, and 82.35%, respectively). CONCLUSIONS: EBUS elastography is potentially capable of further differentiating between benign and malignant LNs. The proposed qualitative and quantitative methods might be useful tools for describing EBUS elastography during EBUS-TBNA.
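The reported percentages follow from a standard 2 × 2 confusion-matrix calculation. The sketch below back-solves plausible counts from the qualitative-method figures (tp = 30, fp = 6, tn = 27, fn = 5 over the 68 nodes) — these counts are inferred from the percentages, not stated in the abstract:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV, and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Counts back-solved from the qualitative-method results (an assumption):
# 35 malignant nodes, 33 benign nodes, 68 total.
m = diagnostic_metrics(tp=30, fp=6, tn=27, fn=5)
```

Plugging these counts in reproduces the reported 85.71% sensitivity, 81.82% specificity, and 83.82% accuracy.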
Subjects
Bronchoscopy/methods, Elasticity Imaging Techniques, Lymph Nodes/pathology, Ultrasonography, Interventional/methods, Female, Humans, Male, Middle Aged, Observer Variation, Pilot Projects, Retrospective Studies, Thorax
ABSTRACT
BACKGROUND: The increasing availability of genome data motivates massive research studies in personalized treatment and precision medicine. Public cloud services provide a flexible way to mitigate the storage and computation burden of conducting genome-wide association studies (GWAS). However, data privacy is a major concern when sharing such sensitive information in a cloud environment. METHODS: We presented a novel framework (FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption) to fully outsource GWAS (i.e., chi-square statistic computation) using homomorphic encryption. The proposed framework enables secure divisions over encrypted data. We introduced two division protocols (i.e., secure errorless division and secure approximation division) with a trade-off between complexity and accuracy in computing chi-square statistics. RESULTS: The proposed framework was evaluated for the task of chi-square statistic computation with two case-control datasets from the 2015 iDASH genome privacy protection challenge. Experimental results show that the performance of FORESEE can be significantly improved through algorithmic optimization and parallel computation. Remarkably, the secure approximation division provides a significant performance gain without missing any significant SNPs in the chi-square association test on the aforementioned datasets. CONCLUSIONS: Unlike many existing HE-based studies, in which final results need to be computed by the data owner due to the lack of a secure division operation, the proposed FORESEE framework supports complete outsourcing to the cloud and outputs the final encrypted chi-square statistics.
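On plaintext data, the statistic FORESEE computes under encryption reduces to the familiar Pearson chi-square for a 2 × 2 case-control table; the encrypted version is hard precisely because of the division in this formula, which is why the paper introduces two secure division protocols. A plaintext reference with hypothetical counts:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]],
    e.g. rows = case/control, columns = minor/major allele counts."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical allele counts for one SNP.
stat = chi_square_2x2(10, 20, 20, 10)
```

Everything except the final division is additions and multiplications, which homomorphic encryption handles natively; the division by the margin products is the step the secure errorless/approximation protocols address.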
Subjects
Cloud Computing/standards, Computer Security/standards, Genetic Privacy/standards, Genome-Wide Association Study/standards, Humans, Outsourced Services/standards
ABSTRACT
3-D point clouds facilitate 3-D visual applications with detailed information of objects and scenes but bring about enormous challenges in designing efficient compression technologies. The irregular signal statistics and high-order geometric structures of 3-D point clouds cannot be fully exploited by existing sparse representation and deep learning based point cloud attribute compression schemes and graph dictionary learning paradigms. In this paper, we propose a novel p-Laplacian embedding graph dictionary learning framework that jointly exploits the varying signal statistics and high-order geometric structures for 3-D point cloud attribute compression. The proposed framework formulates a nonconvex minimization constrained by p-Laplacian embedding regularization to learn a graph dictionary varying smoothly along the high-order geometric structures. An efficient alternating optimization paradigm is developed by harnessing ADMM to solve the nonconvex minimization. To the best of our knowledge, this paper proposes the first graph dictionary learning framework for point cloud compression. Furthermore, we devise an efficient layered compression scheme that integrates the proposed framework to exploit the correlations of 3-D point clouds in a structured fashion. Experimental results demonstrate that the proposed framework is superior to state-of-the-art transform-based methods in M-term approximation and point cloud attribute compression and outperforms the recent MPEG G-PCC reference software.
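The p-Laplacian regularizer can be read as a graph smoothness energy that interpolates between a total-variation-like penalty (p near 1) and the classical quadratic Laplacian form (p = 2). A minimal sketch of that energy on a weighted graph — the paper embeds it as a constraint inside dictionary learning, which is not reproduced here:

```python
import numpy as np

def p_laplacian_energy(W, f, p=1.5):
    """Graph p-Laplacian smoothness energy: sum over edges of W_ij * |f_i - f_j|**p.
    W: symmetric weighted adjacency matrix; f: a signal on the nodes."""
    i, j = np.triu_indices(len(f), k=1)   # each undirected edge counted once
    return float(np.sum(W[i, j] * np.abs(f[i] - f[j]) ** p))
```

With p = 2 this is the usual Laplacian quadratic form f^T L f; lowering p penalizes large jumps less severely, letting the learned dictionary vary smoothly along geometric structure while tolerating sharp attribute edges.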
ABSTRACT
By randomizing the environments, domain randomization (DR) introduces diversity into the policy training of deep reinforcement learning and thus improves its capability of generalization. The randomization of environments, however, introduces another source of variability for the estimate of policy gradients, in addition to the already high variance incurred by trajectory sampling. Therefore, with standard state-dependent baselines, policy gradient methods may still suffer high variance, resulting in low sample efficiency during DR training. In this paper, we theoretically derive a bias-free and state/environment-dependent optimal baseline for DR, and analytically show its ability to achieve further variance reduction over the standard constant and state-dependent baselines for DR. Based on our theory, we further propose a variance reduced domain randomization (VRDR) approach for policy gradient methods, to strike a tradeoff between variance reduction and computational complexity in practical implementations. By dividing the entire space of environments into subspaces and then estimating the state/subspace-dependent baseline, VRDR enjoys a theoretical guarantee of variance reduction and faster convergence than state-dependent baselines. Empirical evaluations on six robot control tasks with randomized dynamics demonstrate that VRDR not only accelerates the convergence of policy training, but also consistently achieves a better final policy with improved training stability.
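A toy simulation makes the variance argument concrete: when returns depend strongly on which randomized environment was drawn, an environment-dependent baseline removes that component of the variance while a constant baseline cannot. This is only an illustration of the mechanism, not the paper's optimal baseline derivation; all quantities are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
# Two environment "subspaces" with very different return scales.
e = rng.integers(0, 2, size=N)
R = np.where(e == 0, 1.0, 5.0) + rng.normal(0.0, 0.1, size=N)
score = rng.normal(1.0, 0.5, size=N)      # stand-in for grad log pi(a|s)

# Constant baseline: subtract the overall mean return.
g_const = (R - R.mean()) * score

# Environment-dependent baseline: subtract the per-environment mean return.
b_env = np.where(e == 0, R[e == 0].mean(), R[e == 1].mean())
g_env = (R - b_env) * score
```

The residual after the per-environment baseline is only the small within-environment noise, so the gradient estimator's variance drops by orders of magnitude in this toy setting.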
ABSTRACT
Federated learning (FL) commonly encourages the clients to perform multiple local updates before the global aggregation, thus avoiding frequent model exchanges and relieving the communication bottleneck between the server and clients. Though empirically effective, the negative impact of multiple local updates on the stability of FL is not thoroughly studied, which may result in globally unstable and slow convergence. Based on sensitivity analysis, we define in this paper a local-update stability index for general FL, measured by the maximum inter-client model discrepancy after the multiple local updates that mainly stems from data heterogeneity. It makes it possible to determine how much the variation of clients' models with multiple local updates may influence the global model, and can also be linked with convergence and generalization. We theoretically derive the proposed local-update stability for current state-of-the-art FL methods, providing possible insight into their motivation and limitations from a new perspective of stability. For example, naively executing parallel acceleration locally at the clients would harm the local-update stability. Motivated by this, we then propose a novel accelerated yet stabilized FL algorithm (named FedANAG) based on server- and client-level Nesterov accelerated gradient (NAG). In FedANAG, the global and local momenta are elaborately designed and alternately updated, while the stability of the local update is enhanced with the help of the global momentum. We prove the convergence of FedANAG for strongly convex, general convex and non-convex settings. We then conduct evaluations on both synthetic and real-world datasets to first validate our proposed local-update stability.
The results further show that across various data heterogeneity and client participation ratios, FedANAG not only accelerates the global convergence by reducing the required number of communication rounds to a target accuracy, but also converges to a higher final accuracy.
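The abstract does not spell out FedANAG's exact update rule, so the sketch below applies a textbook Nesterov-style momentum step at the server to the averaged client delta (the same look-ahead form used by common SGD implementations). Treat the function name, signature, and coefficients as assumptions about the flavor of the method, not its actual equations:

```python
import numpy as np

def server_nag_update(w, m, avg_client_delta, beta=0.9, lr=1.0):
    """One server-level Nesterov-style momentum step on the averaged client
    update, treated as a pseudo-gradient. Hypothetical sketch, not FedANAG's
    published update rule."""
    m_new = beta * m + avg_client_delta            # global momentum buffer
    w_new = w - lr * (avg_client_delta + beta * m_new)  # NAG look-ahead step
    return w_new, m_new

w = np.zeros(3)
m = np.zeros(3)
delta = np.array([0.1, -0.2, 0.3])   # averaged client delta for one round
w, m = server_nag_update(w, m, delta)
```

The point of the server-level momentum is that it persists across communication rounds, so it can stabilize the local updates that clients perform in between.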
ABSTRACT
Object pose estimation constitutes a critical area within the domain of 3D vision. While contemporary state-of-the-art methods that leverage real-world pose annotations have demonstrated commendable performance, the procurement of such real training data incurs substantial costs. This paper focuses on a specific setting wherein only 3D CAD models are utilized as a priori knowledge, devoid of any background or clutter information. We introduce a novel method, CPPF++, designed for sim-to-real category-level pose estimation. This method builds upon the foundational point-pair voting scheme of CPPF, reformulating it through a probabilistic view. To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty by estimating the probability distribution of each point pair within the canonical space. Furthermore, we augment the contextual information provided by each voting unit through the introduction of N-point tuples. To enhance the robustness and accuracy of the model, we incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a tuple feature ensemble. Alongside these methodological advancements, we introduce a new category-level pose estimation dataset, named DiversePose 300. Empirical evidence demonstrates that our method significantly surpasses previous sim-to-real approaches and achieves comparable or superior performance on novel datasets. Our code is available at https://github.com/qq456cvb/CPPF2.
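CPPF++ builds on point-pair voting, and the classic oriented point-pair feature underlying such schemes is easy to state: the pair distance plus three angles between the normals and the connecting vector. This is the standard PPF descriptor for context, not the paper's N-point tuple extension or its uncertainty modeling:

```python
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """Classic PPF descriptor for an oriented point pair:
    (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1."""
    d = p2 - p1
    dist = np.linalg.norm(d)

    def angle(a, b):
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))   # clip guards rounding

    return np.array([dist, angle(n1, d), angle(n2, d), angle(n1, n2)])
```

Because the four components are invariant to rigid motion, matched pairs between the observed scene and the CAD model can cast consistent votes for the object pose.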
ABSTRACT
Source-free domain adaptation (SFDA) shows the potential to improve the generalizability of deep learning-based face anti-spoofing (FAS) while preserving the privacy and security of sensitive human faces. However, existing SFDA methods are significantly degraded without accessing source data due to the inability to mitigate domain and identity bias in FAS. In this paper, we propose a novel Source-free Domain Adaptation framework for FAS (SDA-FAS) that systematically addresses the challenges of source model pre-training, source knowledge adaptation, and target data exploration under the source-free setting. Specifically, we develop a generalized method for source model pre-training that leverages a causality-inspired PatchMix data augmentation to diminish domain bias and designs a patch-wise contrastive loss to alleviate identity bias. For source knowledge adaptation, we propose a contrastive domain alignment module to align conditional distributions across domains with a theoretical equivalence to adaptation based on source data. Furthermore, target data exploration is achieved via self-supervised learning with patch shuffle augmentation to identify unseen attack types, which is ignored in existing SFDA methods. To the best of our knowledge, this paper provides the first full-stack privacy-preserving framework to address the generalization problem in FAS. Extensive experiments on nineteen cross-dataset scenarios show our framework considerably outperforms state-of-the-art methods.
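To make the PatchMix idea concrete, here is one plausible reading of a patch-level mixing augmentation: swap a random fraction of co-located patches between two images so that domain-specific patch statistics are mixed while facial layout is preserved. The patch size, mixing ratio, and label handling are all assumptions — the abstract does not specify them:

```python
import numpy as np

def patch_mix(img_a, img_b, patch=4, ratio=0.5, rng=None):
    """Hypothetical PatchMix-style augmentation: for each non-overlapping
    patch, swap img_a's patch with the co-located patch of img_b with
    probability `ratio`. Sketch only; not the paper's exact operation."""
    rng = rng or np.random.default_rng(0)
    out = img_a.copy()
    h, w = img_a.shape[:2]
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            if rng.random() < ratio:
                out[i:i + patch, j:j + patch] = img_b[i:i + patch, j:j + patch]
    return out

a = np.zeros((8, 8))   # toy "image" from domain A
b = np.ones((8, 8))    # toy "image" from domain B
mixed = patch_mix(a, b)
```

A patch-wise contrastive loss can then treat patches from the same identity as positives regardless of which image they were mixed from, which is one way to suppress identity bias.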
ABSTRACT
This paper proposes a novel model for intra coding in High Efficiency Video Coding (HEVC), which simultaneously predicts blocks of pixels with optimal rate distortion. It utilizes the spatial statistical correlation for the optimal prediction based on 2-D contexts, in addition to formulating the data-driven structural interdependences to make the prediction error coherent with the probability distribution, which is desirable for successful transform and coding. The structured set prediction model incorporates a max-margin Markov network (M3N) to regulate and optimize multiple block predictions. The model parameters are learned by discriminating the actual pixel value from other possible estimates to maximize the margin (i.e., decision boundary bandwidth). Compared to existing methods that focus on minimizing prediction error, the M3N-based model adaptively maintains the coherence for a set of predictions. Specifically, the proposed model concurrently optimizes a set of predictions by associating the loss for individual blocks with the joint distribution of succeeding discrete cosine transform coefficients. As the sample size grows, the prediction error is asymptotically upper bounded by the training error under the decomposable loss function. As an internal step, we optimize the underlying Markov network structure to find states that achieve the maximal energy using expectation propagation. For validation, we integrate the proposed model into HEVC for optimal mode selection in rate-distortion optimization. The proposed prediction model obtains up to 2.85% bit rate reduction and achieves better visual quality in comparison to HEVC intra coding.
ABSTRACT
This paper explores the problem of reconstructing high-resolution light field (LF) images from hybrid lenses, including a high-resolution camera surrounded by multiple low-resolution cameras. The performance of existing methods is still limited, as they produce either blurry results on plain textured areas or distortions around depth discontinuous boundaries. To tackle this challenge, we propose a novel end-to-end learning-based approach, which can comprehensively utilize the specific characteristics of the input from two complementary and parallel perspectives. Specifically, one module regresses a spatially consistent intermediate estimation by learning a deep multidimensional and cross-domain feature representation, while the other module warps another intermediate estimation, which maintains the high-frequency textures, by propagating the information of the high-resolution view. We finally leverage the advantages of the two intermediate estimations adaptively via the learned confidence maps, leading to the final high-resolution LF image with satisfactory results on both plain textured areas and depth discontinuous boundaries. Besides, to promote the effectiveness of our method trained with simulated hybrid data on real hybrid data captured by a hybrid LF imaging system, we carefully design the network architecture and the training strategy. Extensive experiments on both real and simulated hybrid data demonstrate the significant superiority of our approach over state-of-the-art methods. To the best of our knowledge, this is the first end-to-end deep learning method for LF reconstruction from a real hybrid input. We believe our framework could potentially decrease the cost of high-resolution LF data acquisition and benefit LF data storage and transmission. The code will be publicly available at https://github.com/jingjin25/LFhybridSR-Fusion.
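The adaptive combination of the two intermediate estimations via confidence maps is, at its core, a per-pixel convex blend. A minimal sketch — in the paper the confidence map is predicted by the network, whereas here it is simply given:

```python
import numpy as np

def fuse(est_warp, est_regress, conf):
    """Per-pixel convex blend of two intermediate estimates.
    conf -> 1: trust the warped, texture-preserving estimate;
    conf -> 0: trust the regressed, spatially consistent estimate."""
    return conf * est_warp + (1.0 - conf) * est_regress
```

High confidence where warping is reliable keeps the high-frequency textures, while low confidence near depth discontinuities falls back to the blur-free regressed estimate — which is exactly the complementarity the abstract describes.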
ABSTRACT
It is promising to solve linear inverse problems by unfolding iterative algorithms (e.g., the iterative shrinkage thresholding algorithm (ISTA)) as deep neural networks (DNNs) with learnable parameters. However, existing ISTA-based unfolded algorithms restrict the network architectures for iterative updates with the partial weight coupling structure to guarantee convergence. In this paper, we propose hybrid ISTA to unfold ISTA with both pre-computed and learned parameters by incorporating free-form DNNs (i.e., DNNs with arbitrary feasible and reasonable network architectures), while ensuring theoretical convergence. We first develop HCISTA to improve the efficiency and flexibility of classical ISTA (with pre-computed parameters) without compromising the convergence rate in theory. Furthermore, the DNN-based hybrid algorithm is generalized to popular variants of learned ISTA, dubbed HLISTA, to enable a free architecture of learned parameters with a guarantee of linear convergence. To the best of our knowledge, this paper is the first to provide a convergence-provable framework that enables free-form DNNs in ISTA-based unfolded algorithms. This framework is general enough to endow arbitrary DNNs for solving linear inverse problems with convergence guarantees. Extensive experiments demonstrate that hybrid ISTA can reduce the reconstruction error with an improved convergence rate in the tasks of sparse recovery and compressive sensing.
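For reference, the classical ISTA iteration that these unfolded networks start from is a gradient step on the data-fidelity term followed by soft thresholding, with step size 1/L where L is the largest squared singular value of the measurement matrix:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam=0.1, iters=200):
    """Classical ISTA for min_x 0.5 * ||Ax - y||^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x
```

Unfolded variants (LISTA and its descendants) replace the fixed matrices `A.T @ A / L` and threshold `lam / L` with learned, layer-wise parameters; the paper's contribution is letting arbitrary free-form DNNs into these updates while keeping the convergence proof.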
ABSTRACT
Endobronchial ultrasound (EBUS) elastography videos have shown great potential to supplement intrathoracic lymph node diagnosis. However, it is laborious and subjective for specialists to select representative frames from the lengthy videos and make a diagnosis, and a framework for automatic representative frame selection and diagnosis is lacking. To this end, we propose a novel deep learning framework that achieves reliable diagnosis by explicitly selecting sparse representative frames and guaranteeing the invariance of diagnostic results to permutations of the video frames. Specifically, we develop a differentiable sparse graph attention mechanism that jointly considers frame-level features and the interactions across frames to select sparse representative frames and exclude disturbed frames. Furthermore, instead of adopting deep learning-based frame-level features, we introduce the normalized color histogram, which incorporates the domain knowledge of EBUS elastography images and achieves superior performance. To the best of our knowledge, the proposed framework is the first to simultaneously achieve automatic representative frame selection and diagnosis with EBUS elastography videos. Experimental results demonstrate that it achieves an average accuracy of 81.29% and area under the receiver operating characteristic curve (AUC) of 0.8749 on the collected dataset of 727 EBUS elastography videos, which is comparable to the performance of expert-based clinical methods built on manually selected representative frames.
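The normalized color histogram feature is straightforward to compute. A sketch assuming 8-bit RGB frames and 16 bins per channel — the bin count and channel layout are assumptions, since the abstract does not state the exact configuration:

```python
import numpy as np

def normalized_histogram(img, bins=16):
    """Per-channel color histogram, concatenated and normalized to sum to 1.
    img: (H, W, C) array with 8-bit values in [0, 256)."""
    feats = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
             for c in range(img.shape[-1])]
    h = np.concatenate(feats).astype(float)
    return h / h.sum()

img = np.zeros((4, 4, 3), dtype=np.uint8)   # toy all-black frame
f = normalized_histogram(img)
```

Because elastography encodes stiffness directly in color, such a hand-crafted distribution over hues is a plausible reason this feature can beat generic learned frame embeddings here.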
Subjects
Elasticity Imaging Techniques, Humans, Elasticity Imaging Techniques/methods, Thorax, Lymph Nodes/diagnostic imaging, Lymph Nodes/pathology, ROC Curve, Endosonography/methods
ABSTRACT
Self-supervised learning based on instance discrimination has shown remarkable progress. In particular, contrastive learning, which regards each image as well as its augmentations as an individual class and tries to distinguish them from all other images, has proven effective for representation learning. However, conventional contrastive learning does not model the relation between semantically similar samples explicitly. In this paper, we propose a general module that considers the semantic similarity among images. This is achieved by expanding the views generated by a single image to Cross-Samples and Multi-Levels, and modeling the invariance to semantically similar images in a hierarchical way. Specifically, the cross-samples are generated by a data mixing operation, which is constrained within samples that are semantically similar, while the multi-level samples are expanded at the intermediate layers of a network. In this way, the contrastive loss is extended to allow for multiple positives per anchor, and explicitly pulls semantically similar images together at different layers of the network. Our method, termed CSML, has the ability to integrate multi-level representations across samples in a robust way. CSML is applicable to current contrastive-based methods and consistently improves the performance. Notably, using MoCo v2 as an instantiation, CSML achieves 76.6% top-1 accuracy with linear evaluation using ResNet-50 as the backbone, and 66.7% and 75.1% top-1 accuracy with only 1% and 10% labels, respectively. All these numbers set a new state of the art. The code is available at https://github.com/haohang96/CSML.
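Extending the contrastive loss "to allow for multiple positives per anchor" amounts to averaging the log-softmax terms over each anchor's positive set, in the spirit of supervised/multi-positive contrastive losses. A sketch over a precomputed similarity matrix — self-similarity handling and the exact temperature are simplified, and this is not claimed to be CSML's exact objective:

```python
import numpy as np

def multi_positive_nce(sim, pos_mask, tau=0.1):
    """Contrastive loss with several positives per anchor.
    sim: (N, N) similarity matrix; pos_mask: (N, N) binary positive mask
    with at least one positive per row."""
    logits = sim / tau
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Average the log-softmax over each anchor's positive set.
    per_anchor = -(log_prob * pos_mask).sum(axis=1) / pos_mask.sum(axis=1)
    return float(per_anchor.mean())
```

With exactly one positive per anchor this reduces to the standard InfoNCE objective, which is the "special case" relationship the paper builds on.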
ABSTRACT
Batch normalization (BN) is a fundamental unit in modern deep neural networks. However, BN and its variants focus on normalization statistics but neglect the recovery step that uses a linear transformation to improve the capacity of fitting complex data distributions. In this paper, we demonstrate that the recovery step can be improved by aggregating the neighborhood of each neuron rather than just considering a single neuron. Specifically, we propose a simple yet effective method named batch normalization with enhanced linear transformation (BNET) to embed spatial contextual information and improve representation ability. BNET can be easily implemented using depth-wise convolution and seamlessly transplanted into existing architectures with BN. To the best of our knowledge, BNET is the first attempt to enhance the recovery step for BN. Furthermore, BN is interpreted as a special case of BNET from both spatial and spectral views. Experimental results demonstrate that BNET achieves consistent performance gains based on various backbones in a wide range of visual tasks. Moreover, BNET can accelerate the convergence of network training and enhance spatial information by assigning large weights to important neurons.
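The "enhanced linear transformation" idea can be sketched directly: keep BN's normalization statistics, but replace the per-channel scalar scale of the recovery step with a depth-wise spatial aggregation over each neuron's 3 × 3 neighborhood. This NumPy sketch assumes channels-last layout and omits the bias term; it illustrates the mechanism rather than reproducing BNET's implementation:

```python
import numpy as np

def bn_enhanced(x, kernel, eps=1e-5):
    """BN statistics + depth-wise 3x3 recovery step.
    x: (H, W, C) activations; kernel: (3, 3, C) per-channel spatial weights.
    With a kernel that is 1 at the center and 0 elsewhere, this reduces to
    plain BN with unit scale -- the 'special case' interpretation."""
    xn = (x - x.mean(axis=(0, 1))) / np.sqrt(x.var(axis=(0, 1)) + eps)
    xp = np.pad(xn, ((1, 1), (1, 1), (0, 0)))      # zero-pad spatial dims
    out = np.zeros_like(xn)
    for di in range(3):                            # depth-wise 3x3 convolution
        for dj in range(3):
            out += kernel[di, dj] * xp[di:di + xn.shape[0], dj:dj + xn.shape[1]]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 2))
k_id = np.zeros((3, 3, 2))
k_id[1, 1] = 1.0                                   # identity (plain-BN) kernel
```

In a real network the `(3, 3, C)` kernel is learned per BN layer, which is why BNET drops into existing architectures as easily as the scalar affine parameters it replaces.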
ABSTRACT
In this paper, we propose K-Shot Contrastive Learning (KSCL) of visual features, which applies multiple augmentations to investigate the sample variations within individual instances. It aims to combine the advantages of inter-instance discrimination, learning discriminative features to distinguish between different instances, and intra-instance variations, matching queries against the variants of augmented samples over instances. Particularly, for each instance, it constructs an instance subspace to model the configuration of how the significant factors of variation in K-shot augmentations can be combined to form the variants of augmentations. Given a query, the most relevant variant of instances is then retrieved by projecting the query onto their subspaces to predict the positive instance class. This generalizes existing contrastive learning, which can be viewed as a special one-shot case. An eigenvalue decomposition is performed to configure the instance subspaces, and the embedding network can be trained end-to-end through the differentiable subspace configuration. Experimental results demonstrate that the proposed K-shot contrastive learning achieves performance superior to state-of-the-art unsupervised methods.
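The subspace-projection step can be sketched with a truncated SVD: the K augmented embeddings of an instance span a low-dimensional subspace, and a query's relevance to that instance is the norm of its projection onto it. This is an illustrative reading of the mechanism (using SVD in place of the paper's eigenvalue decomposition of the covariance, to which it is equivalent up to scaling); the function and names are hypothetical:

```python
import numpy as np

def subspace_score(query, shots, k=2):
    """Relevance of a query embedding to an instance, measured as the norm of
    its projection onto the top-k subspace spanned by the instance's K
    augmented embeddings (rows of `shots`)."""
    shots = shots / np.linalg.norm(shots, axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(shots.T, full_matrices=False)
    B = U[:, :k]                                  # orthonormal subspace basis
    q = query / np.linalg.norm(query)
    return float(np.linalg.norm(B.T @ q))         # in [0, 1]
```

A query lying inside an instance's subspace scores 1 and an orthogonal query scores 0, so ranking instances by this score retrieves "the most relevant variant of instances" described above.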
Subjects
Algorithms, Learning
ABSTRACT
Message passing has evolved as an effective tool for designing graph neural networks (GNNs). However, most existing methods for message passing simply sum or average all the neighboring features to update node representations. They are restricted by two problems: 1) lack of interpretability to identify node features significant to the prediction of GNNs and 2) feature overmixing that leads to the oversmoothing issue in capturing long-range dependencies and inability to handle graphs under heterophily or low homophily. In this article, we propose a node-level capsule graph neural network (NCGNN) to address these problems with an improved message passing scheme. Specifically, NCGNN represents nodes as groups of node-level capsules, in which each capsule extracts distinctive features of its corresponding node. For each node-level capsule, a novel dynamic routing procedure is developed to adaptively select appropriate capsules for aggregation from a subgraph identified by the designed graph filter. NCGNN aggregates only the advantageous capsules and restrains irrelevant messages to avoid overmixing features of interacting nodes. Therefore, it can relieve the oversmoothing issue and learn effective node representations over graphs with homophily or heterophily. Furthermore, our proposed message passing scheme is inherently interpretable and exempt from complex post hoc explanations, as the graph filter and the dynamic routing procedure identify a subset of node features that are most significant to the model prediction from the extracted subgraph. Extensive experiments on synthetic as well as real-world graphs demonstrate that NCGNN can well address the oversmoothing issue and produce better node representations for semisupervised node classification. It outperforms the state of the art under both homophily and heterophily.
ABSTRACT
With the advent of data science, the analysis of network or graph data has become a very timely research problem. A variety of recent works have been proposed to generalize neural networks to graphs, either from a spectral graph theory or a spatial perspective. The majority of these works, however, focus on adapting the convolution operator to graph representation. At the same time, the pooling operator also plays an important role in distilling multiscale and hierarchical representations, but it has been mostly overlooked so far. In this article, we propose a parameter-free pooling operator, called iPool, that retains the most informative features in arbitrary graphs. Arguing that informative nodes dominantly characterize graph signals, we propose a criterion to evaluate the amount of information of each node given its neighbors and theoretically demonstrate its relationship to neighborhood conditional entropy. This new criterion determines how nodes are selected and coarsened graphs are constructed in the pooling layer. The resulting hierarchical structure yields an effective isomorphism-invariant representation of networked data on arbitrary topologies. The proposed strategy achieves superior or competitive performance in graph classification on a collection of public graph benchmark data sets and superpixel-induced image graph data sets.
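One simple proxy for "the amount of information of each node given its neighbors" is how poorly a node's features are predicted by its neighborhood average: nodes that deviate strongly carry information the neighborhood cannot reconstruct. The sketch below uses this deviation proxy for illustration — the paper's actual criterion is the one it relates to neighborhood conditional entropy, which is not reproduced here:

```python
import numpy as np

def node_informativeness(A, X):
    """Per-node deviation from the neighborhood average feature.
    A: (N, N) adjacency matrix; X: (N, d) node features.
    Large values = node is poorly predictable from its neighbors."""
    deg = A.sum(axis=1, keepdims=True)
    neigh_avg = (A @ X) / np.maximum(deg, 1)     # guard isolated nodes
    return np.linalg.norm(X - neigh_avg, axis=1)

A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])                  # 3-node path graph
X = np.array([[0.0], [0.0], [10.0]])             # node 2 is the outlier
scores = node_informativeness(A, X)
```

A pooling layer can then keep the top-scoring nodes and coarsen the rest, preserving the features that dominantly characterize the graph signal.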
ABSTRACT
Model quantization is essential to deploy deep convolutional neural networks (DCNNs) on resource-constrained devices. In this article, we propose a general bitwidth assignment algorithm based on theoretical analysis for efficient layerwise weight and activation quantization of DCNNs. The proposed algorithm develops a prediction model to explicitly estimate the loss of classification accuracy caused by weight quantization with a geometrical approach. Consequently, dynamic programming is adopted to achieve optimal bitwidth assignment on weights based on the estimated error. Furthermore, we optimize bitwidth assignment for activations by considering the signal-to-quantization-noise ratio (SQNR) between weight and activation quantization. The proposed algorithm is general enough to reveal the tradeoff between classification accuracy and model size for various network architectures. Extensive experiments demonstrate the efficacy of the proposed bitwidth assignment algorithm and the error rate prediction model. Furthermore, the proposed algorithm is shown to extend well to object detection.
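The SQNR that drives the activation bitwidth assignment is easy to measure for a given tensor and bitwidth: quantize, then compare signal energy to quantization-error energy. The sketch below uses a symmetric uniform quantizer as an illustrative choice — the paper's quantizer and range calibration may differ:

```python
import numpy as np

def uniform_quantize(w, bits):
    """Symmetric uniform quantizer over [-max|w|, max|w|]."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def sqnr_db(w, bits):
    """Signal-to-quantization-noise ratio in dB for a given bitwidth."""
    q = uniform_quantize(w, bits)
    return 10.0 * np.log10(np.sum(w ** 2) / np.sum((w - q) ** 2))

rng = np.random.default_rng(0)
w = rng.normal(size=1000)        # stand-in for one layer's weights
s4, s8 = sqnr_db(w, 4), sqnr_db(w, 8)
```

Each extra bit buys roughly 6 dB of SQNR, so a per-layer SQNR table is a natural cost model for a dynamic program that trades accuracy against model size.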
Subjects
Algorithms, Neural Networks, Computer
ABSTRACT
Photonic neural networks perform brain-inspired computations using photons instead of electrons to achieve substantially improved computing performance. However, existing architectures can only handle data with regular structures and fail to generalize to graph-structured data beyond Euclidean space. Here, we propose the diffractive graph neural network (DGNN), an all-optical graph representation learning architecture based on diffractive photonic computing units (DPUs) and on-chip optical devices to address this limitation. Specifically, the graph node attributes are encoded into strip optical waveguides, transformed by DPUs, and aggregated by optical couplers to extract their feature representations. DGNN captures complex dependencies among node neighborhoods during the light-speed optical message passing over graph structures. We demonstrate the applications of DGNN for node- and graph-level classification tasks with benchmark databases and achieve superior performance. Our work opens up a new direction for designing application-specific integrated photonic circuits for high-efficiency processing of large-scale graph-structured data using deep learning.