Results 1 - 13 of 13
1.
Sensors (Basel) ; 23(5)2023 Feb 24.
Article in English | MEDLINE | ID: mdl-36904736

ABSTRACT

We propose a joint super-resolution (SR) and frame interpolation framework that performs both spatial and temporal super-resolution. We observe that the performance of video super-resolution and video frame interpolation varies with the permutation (order) of the input frames. We postulate that features extracted from multiple frames should be consistent regardless of input order if they are optimally complementary for the respective frames. With this motivation, we propose a permutation-invariant deep architecture that exploits multi-frame SR principles through an order-invariant network. Specifically, given two adjacent frames, our model employs a permutation-invariant convolutional neural network module to extract "complementary" feature representations that facilitate both the SR and temporal interpolation tasks. We demonstrate the effectiveness of our end-to-end joint method against various combinations of competing SR and frame interpolation methods on challenging video datasets, thereby verifying our hypothesis.
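
As a rough illustration of the order-invariance idea (a minimal sketch, not the paper's architecture; the module and layer choices here are assumptions), the following applies one shared encoder to both frames and fuses the features with symmetric operations, so swapping the inputs leaves the output unchanged:

```python
# Minimal sketch: a two-frame feature extractor whose output is unchanged when the input
# frames are swapped, via a shared encoder and symmetric (order-invariant) fusion.
import torch
import torch.nn as nn

class PermutationInvariantEncoder(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.encoder = nn.Sequential(              # weights shared across both frames
            nn.Conv2d(3, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, frame_a, frame_b):
        fa, fb = self.encoder(frame_a), self.encoder(frame_b)
        # Symmetric fusion: element-wise max and mean are invariant to the order of (fa, fb).
        return torch.cat([torch.maximum(fa, fb), (fa + fb) / 2], dim=1)

x1, x2 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
model = PermutationInvariantEncoder()
assert torch.allclose(model(x1, x2), model(x2, x1))  # order invariance holds
```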

2.
Aging Ment Health ; 26(10): 2054-2061, 2022 10.
Article in English | MEDLINE | ID: mdl-34651536

ABSTRACT

OBJECTIVES: We aimed to determine the association of inflammation and respiratory failure with delirium in COVID-19 patients, comparing inflammatory and arterial blood gas markers between patients with COVID-19 delirium and patients with delirium in other medical disorders. METHODS: This cross-sectional study used the CHART-DEL, a validated research tool, to retrospectively screen patients for delirium from clinical notes. The inflammatory markers C-reactive protein (CRP) and white cell count (WBC), and the partial pressures of oxygen (PO2) and carbon dioxide (PCO2), were compared between patients with COVID-19 delirium and delirium in other medical disorders. RESULTS: In bivariate analysis, CRP (mg/L) was significantly higher in the COVID-19 group (81.7 ± 80.0 vs. 58.8 ± 87.7, p = 0.04), and WBC (10⁹/L) was significantly lower (7.44 ± 3.42 vs. 9.71 ± 5.45, p = 0.04). In multiple linear regression with age and sex as covariates, the geometric mean of CRP was 140% higher in the COVID-19 group (95% CI = 7-439%, p = 0.03). There were no significant differences in PO2 or PCO2 between groups. CONCLUSION: The association between higher CRP and COVID-19 in patients with delirium may suggest an inflammatory basis for delirium in COVID-19. Our findings may assist clinicians in establishing whether delirium is due to COVID-19, which may improve the management and outcomes of infected patients.
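
For readers unfamiliar with reporting log-scale regression results, the sketch below shows how a coefficient from a regression on log-transformed CRP converts into a percentage difference in geometric means; the coefficient and confidence bounds here are hypothetical values chosen only to illustrate the arithmetic:

```python
# Minimal sketch (hypothetical numbers): converting a log-scale regression coefficient
# into a percentage difference in geometric means. A "140% higher" geometric mean
# corresponds to a ratio of about 2.40.
import numpy as np

beta = 0.876                      # hypothetical coefficient for the COVID-19 indicator on log(CRP)
ci_low, ci_high = 0.068, 1.684    # hypothetical 95% CI bounds on the log scale

ratio = np.exp(beta)              # geometric-mean ratio, COVID-19 vs. other delirium
pct_higher = (ratio - 1) * 100    # percentage difference in geometric means
print(f"{pct_higher:.0f}% higher (95% CI {np.exp(ci_low) - 1:.0%} to {np.exp(ci_high) - 1:.0%})")
```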


Subjects
COVID-19, Delirium, Biomarkers, C-Reactive Protein/analysis, Cross-Sectional Studies, Delirium/diagnosis, Humans, Retrospective Studies
3.
Sensors (Basel) ; 19(1)2018 Dec 27.
Article in English | MEDLINE | ID: mdl-30591626

ABSTRACT

This paper presents a depth upsampling method that produces a high-fidelity dense depth map from a high-resolution RGB image and LiDAR sensor data. Our proposed method explicitly handles depth outliers and computes the upsampled depth together with confidence information. Our key idea is a self-learning framework that automatically learns to estimate the reliability of the upsampled depth map without human-labeled annotation. The method thereby produces a clear, high-fidelity dense depth map that preserves the shape of object structures well, which benefits subsequent algorithms in follow-up tasks. We qualitatively and quantitatively evaluate our proposed method against other competing methods on the well-known Middlebury 2014 and KITTI benchmark datasets. We demonstrate that our method generates accurate depth maps with smaller errors than other methods while preserving a larger number of valid points. We also show that our approach can be seamlessly applied to improve the quality of depth maps from other depth generation algorithms, such as stereo matching, and discuss potential applications and limitations. Compared to previous work, our proposed method yields similar depth errors on average while retaining at least 3% more valid depth points.
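
One generic way to realize confidence-gated depth output without manual labels is sketched below (an assumption-laden illustration, not the paper's pipeline): reliability targets come from agreement with the sparse LiDAR measurements themselves, and low-confidence depth values are discarded:

```python
# Minimal sketch: confidence-weighted depth upsampling. A dense depth estimate is kept only
# where a per-pixel confidence, self-supervised against sparse LiDAR points, is high.
import numpy as np

def self_supervised_confidence(dense_depth, sparse_depth, sparse_mask, tol=0.05):
    """Pseudo-confidence targets: agreement with the sparse measurements."""
    rel_err = np.abs(dense_depth - sparse_depth) / np.maximum(sparse_depth, 1e-6)
    targets = (rel_err < tol).astype(np.float32)      # 1 = reliable, 0 = likely outlier
    return targets * sparse_mask                      # defined only where LiDAR exists

def filter_by_confidence(dense_depth, confidence, thresh=0.5):
    out = dense_depth.copy()
    out[confidence < thresh] = np.nan                 # drop unreliable depth values
    return out

h, w = 4, 4
dense = np.full((h, w), 10.0)
sparse = np.full((h, w), 10.0); sparse[0, 0] = 20.0   # one outlier measurement
mask = np.ones((h, w), dtype=np.float32)
conf = self_supervised_confidence(dense, sparse, mask)
print(filter_by_confidence(dense, conf))
```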

4.
Sci Adv ; 10(5): eadk4284, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38306429

ABSTRACT

The conflict between stiffness and toughness is a fundamental problem in engineering materials design. However, the systematic discovery of microstructured composites with optimal stiffness-toughness trade-offs has never been demonstrated, hindered by the discrepancies between simulation and reality and the lack of data-efficient exploration of the entire Pareto front. We introduce a generalizable pipeline that integrates physical experiments, numerical simulations, and artificial neural networks to address both challenges. Without any prescribed expert knowledge of material design, our approach implements a nested-loop proposal-validation workflow to bridge the simulation-to-reality gap and find microstructured composites that are stiff and tough with high sample efficiency. Further analysis of Pareto-optimal designs allows us to automatically identify existing toughness enhancement mechanisms, which were previously found through trial and error or biomimicry. On a broader scale, our method provides a blueprint for computational design in various research areas beyond solid mechanics, such as polymer chemistry, fluid dynamics, meteorology, and robotics.
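
As a small, self-contained illustration of the trade-off analysis (not the paper's pipeline), the sketch below extracts the Pareto front, i.e., the non-dominated designs, from a set of candidate designs scored on stiffness and toughness; the numbers are made up:

```python
# Minimal sketch: extracting the Pareto front of candidate designs evaluated on two
# competing objectives, stiffness and toughness (both maximized).
import numpy as np

def pareto_front(points):
    """Return the indices of non-dominated points (maximization in every column)."""
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(np.all(points >= p, axis=1) & np.any(points > p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

designs = np.array([[3.0, 1.0],   # stiff but brittle
                    [1.0, 3.0],   # compliant but tough
                    [2.0, 2.0],   # balanced trade-off
                    [1.5, 1.5]])  # dominated by the balanced design
print(pareto_front(designs))      # -> [0, 1, 2]
```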

5.
Article in English | MEDLINE | ID: mdl-37159323

ABSTRACT

Most deep anomaly detection models learn normality from datasets, because the diverse and inconsistent nature of abnormality makes it difficult to define. It has therefore been common practice to learn normality under the assumption that anomalous data are absent from the training dataset, which we call the normality assumption. In practice, however, this assumption is often violated because real data distributions have anomalous tails, i.e., the dataset is contaminated. The gap between the assumption and the actual training data then detrimentally affects the learning of an anomaly detection model. In this work, we propose a learning framework to reduce this gap and achieve a better normality representation. Our key idea is to identify sample-wise normality and use it as an importance weight that is updated iteratively during training. Our framework is designed to be model-agnostic and hyperparameter-insensitive, so it applies to a wide range of existing methods without careful parameter tuning. We apply the framework to three representative approaches to deep anomaly detection: one-class classification-, probabilistic model-, and reconstruction-based approaches. In addition, we address the importance of a termination condition for iterative methods and propose a termination criterion inspired by the anomaly detection objective. We validate that our framework improves the robustness of anomaly detection models under different contamination ratios on five anomaly detection benchmark datasets and two image datasets. On various contaminated datasets, our framework improves the performance of the three representative anomaly detection methods, measured by the area under the ROC curve.
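
A minimal sketch of the general idea, not the paper's algorithm: sample-wise normality scores are turned into importance weights and updated iteratively, here with a trivial centroid "model" standing in for a deep anomaly detector, together with a simple convergence-based termination criterion:

```python
# Minimal sketch: iteratively re-weighting training samples by an estimate of their
# normality so that a simple model (here, just a centroid) is fit mostly on normal data
# even when the training set is contaminated.
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(950, 2))
anomalies = rng.normal(6.0, 1.0, size=(50, 2))               # contamination in the training set
data = np.vstack([normal, anomalies])

weights = np.ones(len(data))
for _ in range(20):
    center = (weights[:, None] * data).sum(0) / weights.sum()  # weighted model fit
    scores = np.linalg.norm(data - center, axis=1)             # anomaly score per sample
    new_weights = np.exp(-scores / scores.mean())              # higher score -> lower weight
    if np.abs(new_weights - weights).max() < 1e-4:             # simple termination criterion
        break
    weights = new_weights

print("estimated normal center:", center)   # pulled toward the true normal mode near (0, 0)
```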

6.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 7348-7362, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34648432

ABSTRACT

We introduce dense relational captioning, a novel image captioning task that aims to generate multiple captions with respect to the relational information between objects in a visual scene. Relational captioning provides explicit descriptions for each relationship between object combinations. This framework is advantageous in both the diversity and the amount of information, leading to comprehensive image understanding based on relationships, e.g., relational proposal generation. For relational understanding between objects, the part of speech (POS; i.e., subject-object-predicate categories) can provide valuable prior information to guide the causal sequence of words in a caption. We train our framework not only to generate captions but also to predict the POS of each word. To this end, we propose the multi-task triple-stream network (MTTSNet), which consists of three recurrent units, one responsible for each POS category, trained by jointly predicting the correct caption and the POS of each word. In addition, we found that the performance of MTTSNet can be improved by modulating the object embeddings with an explicit relational module. We demonstrate that our proposed model generates more diverse and richer captions through extensive experimental analysis on large-scale datasets and several metrics. We then present applications of our framework to holistic image captioning, scene graph generation, and retrieval tasks.
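
A minimal sketch of the triple-stream idea (the layer sizes, gating scheme, and module names are assumptions, not the exact MTTSNet): three recurrent streams, one per coarse POS class, are mixed by a predicted POS distribution at each decoding step, so the model is supervised on both words and POS:

```python
# Minimal sketch: one decoding step with three recurrent streams (subject / predicate /
# object) whose hidden states are mixed by a predicted POS distribution.
import torch
import torch.nn as nn

class TripleStreamDecoderStep(nn.Module):
    def __init__(self, embed_dim=128, hidden=256, vocab=1000, num_pos=3):
        super().__init__()
        self.streams = nn.ModuleList([nn.GRUCell(embed_dim, hidden) for _ in range(num_pos)])
        self.pos_head = nn.Linear(hidden * num_pos, num_pos)   # predicts the POS class
        self.word_head = nn.Linear(hidden, vocab)               # shared word classifier

    def forward(self, word_emb, hiddens):
        hiddens = [cell(word_emb, h) for cell, h in zip(self.streams, hiddens)]
        pos_logits = self.pos_head(torch.cat(hiddens, dim=-1))
        pos_prob = pos_logits.softmax(dim=-1)                   # soft stream selection
        mixed = sum(pos_prob[:, i:i + 1] * h for i, h in enumerate(hiddens))
        return self.word_head(mixed), pos_logits, hiddens       # word + POS supervision

step = TripleStreamDecoderStep()
h0 = [torch.zeros(2, 256) for _ in range(3)]
word_logits, pos_logits, h1 = step(torch.rand(2, 128), h0)
```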

7.
IEEE Trans Pattern Anal Mach Intell ; 44(9): 5460-5471, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34057889

ABSTRACT

Taking selfies has become one of the major photographic trends of our time. In this study, we focus on the selfie stick, on which a camera is mounted to take selfies. We observe that a camera on a selfie stick typically travels along a particular type of trajectory around a sphere. Based on this observation, we propose a robust, efficient, and optimal method for estimating the relative camera pose between two images captured by a camera mounted on a selfie stick. We exploit the special geometric structure of camera motion constrained by a selfie stick and define this motion as spherical joint motion. Using a novel parametrization and calibration scheme, we show that the pose estimation problem reduces to a 3-degrees-of-freedom (DoF) search problem instead of a generic 6-DoF problem. This facilitates the derivation of an efficient branch-and-bound optimization method that guarantees a globally optimal solution, even in the presence of outliers. Furthermore, as a simplified case of spherical joint motion, we introduce selfie motion, which has fewer DoF than spherical joint motion. We validate the performance and guaranteed optimality of our method on both synthetic and real-world data. Additionally, we demonstrate the applicability of the proposed method in two applications: refocusing and stylization.
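
To make the branch-and-bound principle concrete, here is a minimal 1-DoF sketch (the paper's search is 3-DoF and its bounds come from the spherical-joint geometry, which is not reproduced here): an interval is pruned only when an upper bound on its attainable inlier count cannot beat the best solution found so far, which is what guarantees global optimality:

```python
# Minimal sketch: branch-and-bound inlier-set maximization over a single rotation angle.
import numpy as np

observations = np.deg2rad(np.array([10.0, 10.5, 9.5, 11.0, 70.0, 200.0]))  # noisy angles
tol = np.deg2rad(1.5)

def inliers(theta):
    return int(np.sum(np.abs(observations - theta) <= tol))

def upper_bound(lo, hi):
    # Any theta in [lo, hi] is within (hi - lo) / 2 of the midpoint, so inflate the tolerance.
    mid, half = (lo + hi) / 2, (hi - lo) / 2
    return int(np.sum(np.abs(observations - mid) <= tol + half))

best_theta, best_count = 0.0, -1
queue = [(0.0, 2 * np.pi)]
while queue:
    lo, hi = queue.pop()
    if upper_bound(lo, hi) <= best_count:
        continue                                   # prune: this branch cannot improve
    mid = (lo + hi) / 2
    if inliers(mid) > best_count:
        best_theta, best_count = mid, inliers(mid)
    if hi - lo > np.deg2rad(0.1):                  # branch until intervals are small
        queue += [(lo, mid), (mid, hi)]

print(np.rad2deg(best_theta), best_count)          # ~10 degrees with 4 inliers
```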

8.
IEEE Trans Pattern Anal Mach Intell ; 43(5): 1605-1619, 2021 05.
Article in English | MEDLINE | ID: mdl-31722472

ABSTRACT

Visual events are usually accompanied by sounds in our daily lives. Can machines learn to correlate a visual scene with its sound, and localize the sound source, merely by observing them as humans do? To investigate this empirical learnability, we first present a novel unsupervised algorithm for localizing sound sources in visual scenes. To this end, we develop a two-stream network structure that handles each modality with an attention mechanism for sound source localization. The network naturally reveals the localized response in the scene without human annotation. In addition, we build a new sound source dataset for performance evaluation. Our empirical evaluation shows, however, that the unsupervised method produces false conclusions in some cases. We show that these false conclusions cannot be fixed without human prior knowledge, owing to the well-known mismatch between correlation and causality. To address this issue, we extend our network to supervised and semi-supervised settings through a simple modification enabled by the generic architecture of our two-stream network. We show that the false conclusions can be effectively corrected even with a small amount of supervision, i.e., in a semi-supervised setup. Furthermore, we demonstrate the versatility of the learned audio and visual embeddings for cross-modal content alignment and extend the proposed algorithm to a new application: sound-saliency-based automatic camera-view panning in 360° videos.
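
A minimal sketch of the localization step common to this line of work (not the paper's exact two-stream network): the localization response is the cosine similarity between a global audio embedding and the visual feature at each spatial location:

```python
# Minimal sketch: sound source localization as an audio-visual similarity map.
import torch
import torch.nn.functional as F

def localization_map(visual_feats, audio_emb):
    """visual_feats: (B, C, H, W) from a vision stream; audio_emb: (B, C) from an audio stream."""
    v = F.normalize(visual_feats, dim=1)               # unit-norm feature per spatial location
    a = F.normalize(audio_emb, dim=1)[..., None, None] # unit-norm audio embedding, broadcastable
    return (v * a).sum(dim=1)                          # (B, H, W); higher = likelier source location

vis = torch.rand(2, 512, 14, 14)
aud = torch.rand(2, 512)
heatmap = localization_map(vis, aud)   # upsample to image size for visualization
```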

9.
IEEE Trans Pattern Anal Mach Intell ; 42(10): 2656-2669, 2020 Oct.
Article in English | MEDLINE | ID: mdl-30969915

ABSTRACT

In this work, we describe man-made structures via an appropriate structure assumption, called the Atlanta world assumption, which consists of a vertical direction (typically the gravity direction) and a set of horizontal directions orthogonal to the vertical direction. In contrast to the commonly used Manhattan world assumption, the horizontal directions in Atlanta world are not necessarily orthogonal to each other. While Atlanta world can encompass a wider range of scenes, this makes the search space much larger and the problem more challenging. Our input is a set of surface normals, acquired, for example, from RGB-D cameras or 3D laser scanners, together with lines from calibrated images. Given this input, we propose the first globally optimal method of inlier set maximization for Atlanta direction estimation. We define a novel search space for Atlanta world, as well as its parametrization, and solve this challenging problem using a branch-and-bound (BnB) framework. To alleviate the computational bottleneck of BnB, i.e., the bound computation, we present two bound computation strategies, a rectangular bound and a slice bound, in an efficient measurement domain, the extended Gaussian image (EGI). In addition, we propose an efficient two-stage method that automatically estimates the number of horizontal directions of a scene. Experimental results on synthetic and real-world datasets confirm the validity of our approach.
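
As a rough illustration of the objective being maximized (only the inlier-counting score, not the branch-and-bound solver or its bounds; the parametrization here is an assumption), the sketch below counts how many surface normals are explained by a candidate Atlanta frame given a vertical direction and a set of horizontal azimuth angles:

```python
# Minimal sketch: inlier count for an Atlanta frame, i.e., one vertical direction plus
# several horizontal directions orthogonal to the vertical but not to each other.
import numpy as np

def atlanta_inliers(normals, vertical, horiz_azimuths, tol_deg=5.0):
    """normals: (N, 3) unit vectors; vertical: (3,) vector; horiz_azimuths: angles in radians."""
    vertical = vertical / np.linalg.norm(vertical)
    # Build an orthonormal basis of the horizontal plane (orthogonal to the vertical).
    u = np.cross(vertical, [0.0, 0.0, 1.0])
    if np.linalg.norm(u) < 1e-8:                       # vertical is the z-axis
        u = np.array([1.0, 0.0, 0.0])
    u /= np.linalg.norm(u)
    w = np.cross(vertical, u)
    dirs = [vertical] + [np.cos(a) * u + np.sin(a) * w for a in horiz_azimuths]
    cos_tol = np.cos(np.deg2rad(tol_deg))
    # A normal is an inlier if it is nearly parallel (up to sign) to any frame direction.
    dots = np.abs(normals @ np.stack(dirs).T)          # (N, 1 + num_horizontal)
    return int(np.sum(dots.max(axis=1) >= cos_tol))

normals = np.array([[0, 0, 1], [1, 0, 0], [np.cos(np.pi / 3), np.sin(np.pi / 3), 0]], float)
print(atlanta_inliers(normals, np.array([0.0, 0.0, 1.0]), [0.0, np.pi / 3]))  # -> 3
```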

10.
IEEE Trans Pattern Anal Mach Intell ; 41(3): 682-696, 2019 Mar.
Article in English | MEDLINE | ID: mdl-29993475

ABSTRACT

Most man-made environments, such as urban and indoor scenes, consist of a set of parallel and orthogonal planar structures. Such structures are approximated by the Manhattan world assumption, a notion that can be represented as a Manhattan frame (MF). Given a set of inputs such as surface normals or vanishing points, we pose MF estimation as a consensus set maximization problem that maximizes the number of inliers over the rotation search space. This problem can conventionally be solved with a branch-and-bound framework, which mathematically guarantees global optimality. However, the computation time of conventional branch-and-bound algorithms is far from real-time. In this paper, we propose a novel bound computation method on an efficient measurement domain for MF estimation, i.e., the extended Gaussian image (EGI). By relaxing the original problem, we compute the bound with constant complexity while preserving global optimality. Furthermore, we quantitatively and qualitatively demonstrate the performance of the proposed method on various synthetic and real-world data. We also show the versatility of our approach through three different applications: extension to multiple MF estimation, 3D-rotation-based video stabilization, and vanishing point estimation (line clustering).
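
For concreteness, the sketch below scores a candidate rotation by the consensus (inlier) count that such a search maximizes; it shows only the objective, not the EGI-based bound computation or the branch-and-bound search itself:

```python
# Minimal sketch: consensus-set objective for Manhattan frame estimation. A candidate
# rotation is scored by how many surface normals align, up to sign, with one of its
# three column axes.
import numpy as np

def manhattan_inliers(normals, R, tol_deg=5.0):
    """normals: (N, 3) unit vectors; R: 3x3 rotation whose columns are the Manhattan axes."""
    cos_tol = np.cos(np.deg2rad(tol_deg))
    dots = np.abs(normals @ R)                 # |cos| of the angle to each of the three axes
    return int(np.sum(dots.max(axis=1) >= cos_tol))

# A branch-and-bound search would evaluate this objective (and an upper bound of it) over
# nested subsets of the rotation space; here we simply score the identity rotation.
normals = np.array([[0.999, 0.03, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.7]])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
print(manhattan_inliers(normals, np.eye(3)))   # -> 2 (the third normal fits no axis)
```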

11.
IEEE Trans Pattern Anal Mach Intell ; 40(2): 376-391, 2018 02.
Article in English | MEDLINE | ID: mdl-28278459

ABSTRACT

Rank minimization can be converted into tractable surrogate problems, such as Nuclear Norm Minimization (NNM) and Weighted NNM (WNNM). Problems related to NNM or WNNM can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT) or Weighted SVT, but they suffer from the high computational cost of the Singular Value Decomposition (SVD) at each iteration. We propose a fast and accurate approximation method for SVT, which we call fast randomized SVT (FRSVT), that avoids direct computation of the SVD. The key idea is to extract an approximate basis for the range of the matrix from its compressed counterpart. Given the basis, we compute partial singular values of the original matrix from the small factored matrix. In addition, by developing a range propagation method, our approach further speeds up the extraction of the approximate basis at each iteration. Our theoretical analysis shows the relationship between the approximation bound of the SVD and its effect on NNM via SVT. Along with the analysis, our empirical results quantitatively and qualitatively show that the approximation rarely harms the convergence of the host algorithms. We assess the efficiency and accuracy of the proposed method on various computer vision problems, e.g., subspace clustering, weather artifact removal, and simultaneous multi-image alignment and rectification.
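
A minimal sketch of a generic randomized SVT step (not the exact FRSVT algorithm, which additionally propagates the range basis across iterations): a random projection yields an approximate basis for the range of the matrix, the SVD is taken on the small factored matrix, and the singular values are soft-thresholded:

```python
# Minimal sketch: randomized singular value thresholding without a full SVD of A.
import numpy as np

def randomized_svt(A, tau, rank, oversample=5, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Y = A @ rng.standard_normal((n, rank + oversample))   # sketch of the range of A
    Q, _ = np.linalg.qr(Y)                                 # approximate orthonormal basis
    B = Q.T @ A                                            # small (rank + oversample) x n matrix
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    s_thresholded = np.maximum(s - tau, 0.0)               # soft-threshold singular values
    return (Q @ U_b) * s_thresholded @ Vt                  # proximal operator of the nuclear norm

# Low-rank matrix plus small noise: randomized SVT closely matches the exact SVT result.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 150)) \
    + 0.01 * rng.standard_normal((200, 150))
X = randomized_svt(A, tau=1.0, rank=10)
```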

12.
IEEE Trans Pattern Anal Mach Intell ; 38(4): 744-58, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26353362

ABSTRACT

Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering the underlying low-rank structure of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only is the underlying structure of the clean data known to be low-rank, but the exact rank of the clean data is also known. Yet, when conventional rank minimization is applied to these problems, the objective function is formulated in a way that does not fully utilize this a priori target rank information. This observation motivates us to investigate whether there is a better alternative when using rank minimization. In this paper, instead of minimizing the nuclear norm, we propose to minimize the partial sum of singular values, which implicitly encourages the target rank constraint. Our experimental analysis shows that, when the number of samples is deficient, our approach leads to a higher success rate than conventional rank minimization, while the solutions obtained by the two approaches are almost identical when the number of samples is more than sufficient. We apply our approach to various low-level vision problems, e.g., high dynamic range imaging, motion edge detection, photometric stereo, and image alignment and recovery, and show that our results outperform those obtained by conventional nuclear-norm rank minimization.
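
Assuming the standard form of this operator, the sketch below contrasts the partial-sum proximal step with plain singular value thresholding: the leading target-rank singular values are kept intact and only the tail is shrunk:

```python
# Minimal sketch: proximal step for the partial sum of singular values, which leaves the
# largest target_rank singular values untouched and soft-thresholds only the remaining ones.
import numpy as np

def partial_sum_prox(A, tau, target_rank):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_new = s.copy()
    s_new[target_rank:] = np.maximum(s[target_rank:] - tau, 0.0)  # shrink only the tail
    return (U * s_new) @ Vt

A = np.diag([10.0, 5.0, 0.8, 0.3])
print(np.round(np.linalg.svd(partial_sum_prox(A, tau=0.5, target_rank=2), compute_uv=False), 2))
# -> [10.  5.  0.3  0.]  (leading two values preserved; tail shrunk toward zero)
```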

13.
IEEE Trans Pattern Anal Mach Intell ; 37(6): 1219-32, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26357344

ABSTRACT

This paper introduces a new high dynamic range (HDR) imaging algorithm that utilizes rank minimization. Assuming the camera responds linearly to scene radiance, the input low dynamic range (LDR) images captured with different exposure times exhibit a linear dependency and form a rank-1 matrix when the intensities of corresponding pixels are stacked together. In practice, misalignment caused by camera motion, the presence of moving objects, saturation, and image noise break the rank-1 structure of the LDR images. To address these problems, we present a rank minimization algorithm that simultaneously aligns the LDR images and detects outliers for robust HDR generation. We systematically evaluate the performance of our algorithm on synthetic examples and qualitatively compare our results with those of state-of-the-art HDR algorithms on challenging real-world examples.
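
A minimal sketch of the rank-1 observation itself (the alignment and outlier-detection algorithm is not reproduced here): under a linear response, stacking the vectorized exposures as columns yields a rank-1 matrix, and saturation or motion raises the rank:

```python
# Minimal sketch: with a linear camera response, each LDR exposure is the scene radiance
# scaled by its exposure time, so the stacked exposure matrix has rank 1.
import numpy as np

rng = np.random.default_rng(0)
radiance = rng.uniform(0.0, 1.0, size=100)                 # unknown scene radiance per pixel
exposures = np.array([1 / 60, 1 / 15, 1 / 4])              # exposure times of the LDR shots

D = np.stack([t * radiance for t in exposures], axis=1)    # (num_pixels, num_exposures)
print(np.linalg.matrix_rank(D))                            # -> 1

# Saturation or a moving object breaks the rank-1 structure, which is why a robust
# rank-minimization (low-rank plus sparse outliers) formulation is used for HDR fusion.
D_corrupted = D.copy()
D_corrupted[:10, 2] = 1.0                                  # clipped (saturated) pixels
print(np.linalg.matrix_rank(D_corrupted))                  # -> 2
```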
