Results 1 - 20 of 22
1.
Gastric Cancer ; 27(3): 539-547, 2024 May.
Article in English | MEDLINE | ID: mdl-38240891

ABSTRACT

BACKGROUND: Cycle-consistent generative adversarial network (CycleGAN) is a deep neural network model that performs image-to-image translations. We generated virtual indigo carmine (IC) chromoendoscopy images of gastric neoplasms using CycleGAN and compared their diagnostic performance with that of white light endoscopy (WLE). METHODS: WLE and IC images of 176 patients with gastric neoplasms who underwent endoscopic resection were obtained. We used 1,633 images (911 WLE and 722 IC) of 146 cases in the training dataset to develop virtual IC images using CycleGAN. The remaining 30 WLE images were translated into 30 virtual IC images using the trained CycleGAN and used for validation. The lesion borders were evaluated by 118 endoscopists from 22 institutions using the 60 paired virtual IC and WLE images. The lesion area concordance rate and successful whole-lesion diagnosis were compared. RESULTS: The lesion area concordance rate based on the pathological diagnosis in virtual IC was lower than in WLE (44.1% vs. 48.5%, p < 0.01). The rate of successful whole-lesion diagnosis was higher with virtual IC than with WLE images; however, the difference was not statistically significant (28.2% vs. 26.4%, p = 0.11). Conversely, subgroup analyses revealed a significantly higher rate of successful diagnosis with virtual IC than with WLE for depressed morphology (41.9% vs. 36.9%, p = 0.02), differentiated histology (27.6% vs. 24.8%, p = 0.02), smaller lesion size (42.3% vs. 38.3%, p = 0.01), and assessments by expert endoscopists (27.3% vs. 23.6%, p = 0.03). CONCLUSIONS: The diagnostic ability of virtual IC was higher for some lesions, but not completely superior to that of WLE. Adjustments are required to improve the imaging system's performance.


Subject(s)
Deep Learning , Stomach Neoplasms , Humans , Stomach Neoplasms/diagnostic imaging , Stomach Neoplasms/surgery , Endoscopy/methods , Indigo Carmine
2.
Sensors (Basel) ; 23(17), 2023 Aug 30.
Article in English | MEDLINE | ID: mdl-37687990

ABSTRACT

A camera captures multidimensional information of the real world by convolving it into two dimensions using a sensing matrix. The original multidimensional information is then reconstructed from captured images. Traditionally, multidimensional information has been captured by uniform sampling, but by optimizing the sensing matrix, we can capture images more efficiently and reconstruct multidimensional information with high quality. Although compressive video sensing requires random sampling as a theoretical optimum, when designing the sensing matrix in practice, there are many hardware limitations (such as exposure and color filter patterns). Existing studies have found random sampling is not always the best solution for compressive sensing because the optimal sampling pattern is related to the scene context, and it is hard to manually design a sampling pattern and reconstruction algorithm. In this paper, we propose an end-to-end learning approach that jointly optimizes the sampling pattern as well as the reconstruction decoder. We applied this deep sensing approach to the video compressive sensing problem. We modeled the spatio-temporal sampling and color filter pattern using a convolutional neural network constrained by hardware limitations during network training. We demonstrated that the proposed method performs better than the manually designed method in gray-scale video and color video acquisitions.
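
The sampling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' learned pattern or network: it only shows a coded-exposure forward model in which a binary mask decides during which frames each pixel integrates light. The function names, the one-exposure-per-pixel rule, and the tiled repetition of the pattern are assumptions made for illustration.

```python
import random


def single_exposure_mask(num_frames, height, width, tile=4, seed=0):
    """Build mask[t][y][x] in {0, 1} where each pixel is exposed in exactly one frame.

    The exposed frame is drawn at random inside a tile x tile block, and the
    block is repeated over the sensor (a hypothetical hardware constraint).
    """
    rng = random.Random(seed)
    block = [[rng.randrange(num_frames) for _ in range(tile)] for _ in range(tile)]
    return [
        [[1 if block[y % tile][x % tile] == t else 0 for x in range(width)]
         for y in range(height)]
        for t in range(num_frames)
    ]


def capture(video, mask):
    """Collapse video[t][y][x] into one coded image: sum over t of mask * video."""
    num_frames, height, width = len(video), len(video[0]), len(video[0][0])
    return [
        [sum(mask[t][y][x] * video[t][y][x] for t in range(num_frames))
         for x in range(width)]
        for y in range(height)
    ]
```

In a learned approach such as the one the abstract describes, the mask values would be optimized jointly with a reconstruction decoder; here they are simply drawn at random.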

3.
Sensors (Basel) ; 17(12), 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29194407

ABSTRACT

Color image demosaicking for the Bayer color filter array is an essential image processing operation for acquiring high-quality color images. Recently, residual interpolation (RI)-based algorithms have demonstrated superior demosaicking performance over conventional color difference interpolation-based algorithms. In this paper, we propose adaptive residual interpolation (ARI) that improves existing RI-based algorithms by adaptively combining two RI-based algorithms and selecting a suitable iteration number at each pixel. These are performed based on a unified criterion that evaluates the validity of an RI-based algorithm. Experimental comparisons using standard color image datasets demonstrate that ARI can improve existing RI-based algorithms by more than 0.6 dB in the color peak signal-to-noise ratio and can outperform state-of-the-art algorithms based on training images. We further extend ARI for a multispectral filter array, in which more than three spectral bands are arrayed, and demonstrate that ARI can achieve state-of-the-art performance also for the task of multispectral image demosaicking.

4.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8798-8812, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015666

ABSTRACT

A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) and the degree of polarization (DoP) of reflected light are related to an object's surface normal. In this paper, we propose a novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that effectively exploits geometric, photometric, and polarimetric cues extracted from input multi-view color-polarization images. We first estimate camera poses and an initial 3D model by geometric reconstruction with a standard structure-from-motion and multi-view stereo pipeline. We then refine the initial model by optimizing photometric rendering errors and polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose a novel polarimetric cost function that enables an effective constraint on the estimated surface normal of each vertex, while considering four possible ambiguous azimuth angles revealed from the AoP measurement. The weight for the polarimetric cost is effectively determined based on the DoP measurement, which is regarded as the reliability of polarimetric information. Experimental results using both synthetic and real data demonstrate that our Polarimetric MVIR can reconstruct a detailed 3D shape without assuming a specific surface material and lighting condition.
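
As a rough illustration of the AoP and DoP quantities mentioned in this abstract, the sketch below derives them for one pixel from intensities measured behind linear polarizers at 0, 45, 90, and 135 degrees, using the standard linear Stokes parameters. The four-angle input, the function names, and the reading of the four ambiguous azimuth candidates as the AoP plus multiples of 90 degrees are assumptions, not details confirmed by the abstract.

```python
import math


def aop_dop(i0, i45, i90, i135):
    """Return (AoP in radians within [0, pi), DoP in [0, 1]) for one pixel."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal components
    if s0 <= 0.0:
        return 0.0, 0.0
    dop = min(1.0, math.hypot(s1, s2) / s0)
    aop = (0.5 * math.atan2(s2, s1)) % math.pi
    return aop, dop


def candidate_azimuths(aop):
    """Four azimuth candidates (radians) assumed to remain after one AoP measurement."""
    return [(aop + k * math.pi / 2) % (2 * math.pi) for k in range(4)]
```

A DoP close to zero means the AoP carries little information, which matches the abstract's idea of using the DoP as a reliability weight for the polarimetric cost.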

5.
IEEE Trans Pattern Anal Mach Intell ; 44(4): 2074-2088, 2022 Apr.
Article in English | MEDLINE | ID: mdl-33074802

ABSTRACT

Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate six degree-of-freedom (6DOF) camera pose estimates. In this paper, we extend three publicly available datasets containing images captured under a wide variety of viewing conditions, but lacking camera pose information, with ground truth pose information, making evaluation of the impact of various factors on 6DOF camera pose estimation accuracy possible. We also discuss the performance of state-of-the-art localization approaches on these datasets. Additionally, we release around half of the poses for all conditions, and keep the remaining half private as a test set, in the hopes that this will stimulate research on long-term visual localization, learned local image features, and related research areas. Our datasets are available at visuallocalization.net, where we are also hosting a benchmarking server for automatic evaluation of results on the test set. The presented state-of-the-art results are to a large degree based on submissions to our server.


Subject(s)
Algorithms
6.
IEEE J Transl Eng Health Med ; 9: 1700211, 2021.
Article in English | MEDLINE | ID: mdl-33796417

ABSTRACT

Gastric endoscopy is a gold standard in the clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If a lesion is found, successfully identifying its location relative to the global view of the stomach will lead to better decision making for the next clinical treatment. Our previous research showed that lesion localization could be achieved by reconstructing the whole stomach shape from chromoendoscopic indigo carmine (IC) dye-sprayed images using a structure-from-motion (SfM) pipeline. However, spraying the IC dye over the whole stomach requires additional time, which is undesirable for both patients and practitioners. Our objective is to propose an alternative way to achieve whole stomach 3D reconstruction without the need for the IC dye. We generate virtual IC-sprayed (VIC) images based on image-to-image style translation trained on unpaired real no-IC and IC-sprayed images, where we have investigated the effect of input and output color channel selection for generating the VIC images. We validate our reconstruction results by comparing them with the results using real IC-sprayed images and confirm that the obtained stomach 3D structures are comparable to each other. We also propose a local reconstruction technique to obtain a more detailed surface and texture around a region of interest. The proposed method achieves whole stomach reconstruction using SfM without the need for real IC dye. We have found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result. Clinical impact: We offer a method for frame localization and local 3D reconstruction of a found gastric lesion using standard endoscopy images, leading to better clinical decisions.


Subject(s)
Digestive System Surgical Procedures , Imaging, Three-Dimensional , Endoscopy , Humans , Indigo Carmine , Stomach/diagnostic imaging
7.
IEEE Trans Pattern Anal Mach Intell ; 43(3): 814-829, 2021 Mar.
Article in English | MEDLINE | ID: mdl-31535984

ABSTRACT

Accurate visual localization is a key technology for autonomous navigation. 3D structure-based methods employ 3D models of the scene to estimate the full 6 degree-of-freedom (DOF) pose of a camera very accurately. However, constructing (and extending) large-scale 3D models is still a significant challenge. In contrast, 2D image retrieval-based methods only require a database of geo-tagged images, which is trivial to construct and to maintain. They are often considered inaccurate since they only approximate the positions of the cameras. Yet, the exact camera pose can theoretically be recovered when enough relevant database images are retrieved. In this paper, we demonstrate experimentally that large-scale 3D models are not strictly necessary for accurate visual localization. We create reference poses for a large and challenging urban dataset. Using these poses, we show that combining image-based methods with local reconstructions results in a higher pose accuracy compared to state-of-the-art structure-based methods, albeit at higher run-time costs. We show that some of these run-time costs can be alleviated by exploiting known database image poses. Our results suggest that we might want to reconsider the need for large-scale 3D models in favor of more local models, but also that further research is necessary to accelerate the local reconstruction process.

8.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 3547-3552, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34892005

ABSTRACT

Gastroendoscopy has been a clinical standard for diagnosing and treating conditions that affect a part of a patient's digestive system, such as the stomach. Although gastroendoscopy has many advantages for patients, it poses some challenges for practitioners, such as the lack of 3D perception, including depth and endoscope pose information. Such challenges make it difficult to navigate the endoscope and to localize any lesion found in the digestive tract. To tackle these problems, deep learning-based approaches have been proposed to provide monocular gastroendoscopy with additional yet important depth and pose information. In this paper, we propose a novel supervised approach to train depth and pose estimation networks using consecutive endoscopy images to assist endoscope navigation in the stomach. We first generate real depth and pose training data using our previously proposed whole stomach 3D reconstruction pipeline to avoid poor generalization between computer-generated (CG) models and real data for the stomach. In addition, we propose a novel generalized photometric loss function to avoid the complicated process of finding proper weights for balancing the depth and the pose loss terms, which is required for existing direct depth and pose supervision approaches. We then experimentally show that our proposed generalized loss performs better than existing direct supervision losses.


Subject(s)
Endoscopes , Imaging, Three-Dimensional , Computer Simulation , Endoscopy , Humans
9.
IEEE Trans Pattern Anal Mach Intell ; 43(4): 1293-1307, 2021 Apr.
Article in English | MEDLINE | ID: mdl-31722474

ABSTRACT

We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor spaces. The method proceeds along three steps: (i) efficient retrieval of candidate poses that scales to large-scale environments, (ii) pose estimation using dense matching rather than sparse local features to deal with weakly textured indoor scenes, and (iii) pose verification by virtual view synthesis that is robust to significant changes in viewpoint, scene layout, and occlusion. Second, we release a new dataset with reference 6DoF poses for large-scale indoor localization. Query photographs are captured by mobile phones at a different time than the reference 3D map, thus presenting a realistic indoor localization scenario. Third, we demonstrate that our method significantly outperforms current state-of-the-art indoor localization approaches on this new challenging data. Code and data are publicly available.

10.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 2634-2637, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33018547

ABSTRACT

In this paper, we propose a novel video-based remote heart rate (HR) estimation method based on 3D facial landmarks. The key contributions of our method are twofold: (i) we introduce 3D facial landmark detection to video-based HR estimation and (ii) we propose a novel face patch visibility check based on the face patch normal in 3D space. We experimentally demonstrate that, compared with baseline methods using 2D facial landmarks, our proposed method using 3D facial landmarks improves the robustness of HR estimation to head rotations and partial face occlusion. We also demonstrate that our visibility check is effective for selecting sufficiently visible face patches, contributing to the improvement of HR estimation accuracy.


Subject(s)
Face , Imaging, Three-Dimensional , Heart Rate
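
The normal-based visibility check mentioned in the abstract of entry 10 could look roughly like the sketch below. It assumes a triangular face patch spanned by three 3D landmarks, a camera looking along +z (so the direction toward the camera is -z), vertices ordered so that the cross product points out of the face, and a hypothetical 60-degree angle threshold. None of these details come from the abstract.

```python
import math


def patch_normal(p0, p1, p2):
    """Unit normal of the triangle (p0, p1, p2), computed as (p1 - p0) x (p2 - p0)."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        raise ValueError("degenerate patch: landmarks are collinear")
    return [c / length for c in n]


def is_visible(p0, p1, p2, max_angle_deg=60.0, to_camera=(0.0, 0.0, -1.0)):
    """True if the angle between the patch normal and the camera direction is small enough."""
    normal = patch_normal(p0, p1, p2)
    cos_angle = sum(a * b for a, b in zip(normal, to_camera))
    return cos_angle >= math.cos(math.radians(max_angle_deg))
```

Patches that fail the check would simply be skipped when extracting the pulse signal, so a rotated head contributes only the patches that still face the camera.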
11.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 1848-1852, 2020 Jul.
Article in English | MEDLINE | ID: mdl-33018360

ABSTRACT

Gastric endoscopy is a standard clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If any lesion is found, it is very important to perceive the location of the lesion relative to the global view of the stomach. Our previous research showed that this could be addressed by reconstructing the whole stomach shape from chromoendoscopic images using a structure-from-motion (SfM) pipeline, in which indigo carmine (IC) blue dye-sprayed images were used to increase feature matches for SfM by enhancing the texture of the stomach surface. However, spraying the IC dye over the whole stomach requires additional time, labor, and cost, which is undesirable for patients and practitioners. In this paper, we propose an alternative way to achieve whole stomach 3D reconstruction without the need for the IC dye by generating virtual IC-sprayed (VIC) images based on image-to-image style translation trained on unpaired real no-IC and IC-sprayed images. We have specifically investigated the effect of input and output color channel selection for generating the VIC images and found that translating no-IC green-channel images to IC-sprayed red-channel images gives the best SfM reconstruction result.


Subject(s)
Digestive System Surgical Procedures , Imaging, Three-Dimensional , Carmine , Humans , Indigo Carmine , Stomach/diagnostic imaging
12.
Annu Int Conf IEEE Eng Med Biol Soc ; 2019: 6525-6528, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31947336

ABSTRACT

Inter-beat interval (IBI) and heart rate variability (HRV) are important cardiac parameters that reflect a person's physiological and emotional state. In this paper, we present a framework for accurate IBI and HRV estimation from a facial video based on the reliability of extracted blood volume pulse (BVP) signals. Our framework first extracts candidate BVP signals from multiple randomly sampled face patches. The BVP signals are then assessed based on a reliability metric to select the most reliable BVP signal, from which IBI and HRV are calculated. In experiments, we evaluate three reliability metrics and demonstrate that our framework can estimate IBI and HRV more accurately than a conventional single face region-based framework.


Subject(s)
Algorithms , Blood Volume , Face , Heart Rate , Reproducibility of Results
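
The final step in the abstract of entry 12, calculating IBI and HRV from the selected BVP signal, can be sketched as follows. The simple local-maximum peak detector and the choice of SDNN and RMSSD as HRV summaries are assumptions for illustration; the abstract does not say which peak detector or HRV measures the authors use, and the reliability metrics themselves are not reproduced here.

```python
import math
import statistics


def peak_times(signal, fs):
    """Times in seconds of local maxima that lie above the signal mean."""
    mean = statistics.fmean(signal)
    return [
        i / fs
        for i in range(1, len(signal) - 1)
        if signal[i] > mean and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


def inter_beat_intervals(times):
    """Differences between consecutive beat times, in seconds."""
    return [b - a for a, b in zip(times, times[1:])]


def hrv_metrics(ibi):
    """SDNN (standard deviation) and RMSSD (RMS of successive differences) of the IBIs."""
    if len(ibi) < 2:
        raise ValueError("need at least two inter-beat intervals")
    diffs = [b - a for a, b in zip(ibi, ibi[1:])]
    return {
        "sdnn": statistics.stdev(ibi),
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }
```

On a real BVP signal, band-pass filtering before peak detection would usually be needed; it is omitted to keep the sketch short.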
13.
IEEE J Transl Eng Health Med ; 7: 3300310, 2019.
Article in English | MEDLINE | ID: mdl-32309059

ABSTRACT

Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose various lesions inside a stomach. In order to identify the location of a gastric lesion, such as early cancer or a peptic ulcer, within the stomach, this work aims to reconstruct a color-textured 3D model of the whole stomach from a standard monocular endoscope video and to localize any selected video frame on the 3D model. We examine how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from endoscope images, which is a challenging task due to the texture-less nature of the stomach surface. We specifically investigate the combined effect of chromo-endoscopy and color channel selection on SfM to increase the number of feature points. We also design a plane fitting-based algorithm for 3D point outlier removal to improve the 3D model quality. We show that whole stomach 3D reconstruction can be achieved (more than 90% of the frames can be reconstructed) by using red channel images captured under chromo-endoscopy by spreading indigo carmine (IC) dye on the stomach surface. In experimental results, we demonstrate the reconstructed 3D models for seven subjects and the application of lesion localization and reconstruction. The methodology and results presented in this paper could offer a valuable reference to other researchers and could also be an excellent tool for gastric surgeons in various computer-aided diagnosis applications.
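
The idea of plane fitting-based outlier removal can be sketched as below. This is not the authors' algorithm, whose details the abstract does not give: it fits a single least-squares plane z = a*x + b*y + c to a neighborhood of 3D points and discards points farther than a hypothetical distance threshold from that plane. The function names and the threshold are assumptions, and this parameterization cannot represent vertical planes.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares coefficients (a, b, c) of z = a*x + b*y + c via the normal equations."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    sx, sy, sz = (sum(p[k] for p in points) for k in range(3))
    sxx = sum(x * x for x, y, z in points)
    sxy = sum(x * y for x, y, z in points)
    syy = sum(y * y for x, y, z in points)
    sxz = sum(x * z for x, y, z in points)
    syz = sum(y * z for x, y, z in points)
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    det = _det3(lhs)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or the plane is vertical")
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / det)
    return tuple(coeffs)


def remove_outliers(points, max_distance):
    """Keep points whose perpendicular distance to the fitted plane is within max_distance."""
    a, b, c = fit_plane(points)
    scale = math.sqrt(a * a + b * b + 1.0)
    return [p for p in points
            if abs(a * p[0] + b * p[1] + c - p[2]) / scale <= max_distance]
```

Because one least-squares fit is itself pulled toward the outliers, a practical version would likely fit planes locally and repeat the fit after each removal pass.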

14.
Annu Int Conf IEEE Eng Med Biol Soc ; 2019: 3900-3904, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31946725

ABSTRACT

Gastric endoscopy is a common clinical practice that enables medical doctors to examine a patient's stomach. In order to identify the location of a gastric lesion, such as early gastric cancer, within the stomach, this work aimed to reconstruct the 3D shape of a whole stomach with color texture information generated from a standard monocular endoscope video. Previous works have tried to reconstruct the 3D structures of various organs from endoscope images. However, they mainly focused on a partial surface. In this work, we investigated how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from a standard endoscope video. We specifically investigated the combined effect of chromo-endoscopy and color channel selection on SfM. Our study found that 3D reconstruction of the whole stomach can be achieved by using red channel images captured under chromo-endoscopy by spreading indigo carmine (IC) dye on the stomach surface.


Subject(s)
Endoscopy , Imaging, Three-Dimensional , Stomach/diagnostic imaging , Humans , Indigo Carmine , Motion , Stomach Neoplasms/diagnostic imaging
15.
Neural Netw ; 105: 197-205, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29870927

ABSTRACT

We propose a coupled convolution layer comprising multiple parallel convolutions with mutually constrained filters. Inspired by the biological mechanism of human vision, we constrain the convolution filters such that one set of filter weights is a geometrically rotated, mirrored, or negated version of the other. Our analysis suggests that the coupled convolution layer is more effective for lower layers, where feature maps preserve geometric properties. Experimental comparisons demonstrate that the proposed coupled convolution layer performs slightly better than the original layer while decreasing the number of parameters. We evaluate its effect compared to a non-constrained convolution layer using the CIFAR-10, CIFAR-100, and PlanktonSet 1.0 datasets.


Subject(s)
Machine Learning , Neural Networks, Computer , Retinal Neurons/physiology , Humans , Models, Neurological
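
The filter constraint described in the abstract of entry 15 can be illustrated with a small sketch in which one set of weights yields several coupled filters. The helper names, the 90-degree rotation, the left-right mirroring, and the plain valid-mode correlation are assumptions for illustration; the abstract does not specify how the authors implement the layer.

```python
def rotate90(kernel):
    """Rotate a kernel by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*kernel)][::-1]


def mirror(kernel):
    """Flip a kernel left to right."""
    return [row[::-1] for row in kernel]


def negate(kernel):
    """Multiply every weight by -1."""
    return [[-w for w in row] for row in kernel]


def coupled_filters(base):
    """Derive constrained filters that all share the weights of `base`."""
    return {
        "base": [row[:] for row in base],
        "rotated": rotate90(base),
        "mirrored": mirror(base),
        "negated": negate(base),
    }


def correlate_valid(image, kernel):
    """Valid-mode 2D cross-correlation of nested-list image and kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [sum(kernel[i][j] * image[y + i][x + j] for i in range(kh) for j in range(kw))
         for x in range(len(image[0]) - kw + 1)]
        for y in range(len(image) - kh + 1)
    ]
```

Only the base weights would need to be stored and trained, which is one way to read the reduction in parameter count that the abstract reports.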
16.
IEEE Trans Pattern Anal Mach Intell ; 40(2): 257-271, 2018 Feb.
Article in English | MEDLINE | ID: mdl-28207385

ABSTRACT

We address the problem of large-scale visual place recognition for situations where the scene undergoes a major change in appearance, for example, due to illumination (day/night), change of seasons, aging, or structural modifications over time such as buildings being built or destroyed. Such situations represent a major challenge for current large-scale place recognition methods. This work has the following three principal contributions. First, we demonstrate that matching across large changes in the scene appearance becomes much easier when both the query image and the database image depict the scene from approximately the same viewpoint. Second, based on this observation, we develop a new place recognition approach that combines (i) an efficient synthesis of novel views with (ii) a compact indexable image representation. Third, we introduce a new challenging dataset of 1,125 camera-phone query images of Tokyo that contain major changes in illumination (day, sunset, night) as well as structural changes in the scene. We demonstrate that the proposed approach significantly outperforms other large-scale place recognition techniques on this challenging data.

17.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 5676-5680, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30441624

ABSTRACT

In this paper, we propose a novel heart rate (HR) estimation method using simultaneously recorded RGB and near-infrared (NIR) face videos. The key idea of our method is to automatically select suitable face patches for HR estimation in both spatial and spectral domains. The spatial and spectral face patch selection enables us to robustly estimate HR under various situations, including scenes under which existing RGB camera-based methods fail to accurately estimate HR. For a challenging scene in low light and with light fluctuations, our method can successfully estimate HR for all 20 subjects (±3 beats per minute), while the RGB camera-based methods succeed only for 25% of the subjects.


Subject(s)
Face , Heart Rate , Humans
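
The success criterion quoted in the abstract of entry 17, an estimate within ±3 beats per minute of the reference, can be written down as a short sketch. The function name and the example numbers used in any call are hypothetical; only the tolerance of 3 beats per minute comes from the abstract.

```python
def success_rate(estimated_bpm, reference_bpm, tolerance_bpm=3.0):
    """Fraction of subjects whose HR estimate is within the tolerance of the reference."""
    if len(estimated_bpm) != len(reference_bpm) or not estimated_bpm:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for est, ref in zip(estimated_bpm, reference_bpm)
        if abs(est - ref) <= tolerance_bpm
    )
    return hits / len(estimated_bpm)
```

Under this reading, a result of 1.0 over 20 subjects corresponds to the abstract's statement that HR was estimated successfully for all subjects.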
18.
IEEE Trans Image Process ; 25(3): 1288-300, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26780794

ABSTRACT

In this paper, we propose residual interpolation (RI) as an alternative to color difference interpolation, which is a widely accepted technique for color image demosaicking. Our proposed RI performs the interpolation in a residual domain, where the residuals are differences between observed and tentatively estimated pixel values. Our hypothesis for the RI is that if image interpolation is performed in a domain with a smaller Laplacian energy, its accuracy is improved. Based on the hypothesis, we estimate the tentative pixel values to minimize the Laplacian energy of the residuals. We incorporate the RI into the gradient-based threshold free algorithm, which is one of the state-of-the-art Bayer demosaicking algorithms. Experimental results demonstrate that our proposed demosaicking algorithm using the RI surpasses the state-of-the-art algorithms for the Kodak, the IMAX, and the beyond Kodak data sets.
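
The core idea of interpolating in a residual domain can be shown with a one-dimensional sketch. It assumes the tentative estimate is already available at every position (in the paper it is estimated so as to minimize the Laplacian energy of the residuals, which is not reproduced here) and uses plain linear interpolation of the residuals. The function name and the 1D setting are simplifications for illustration.

```python
def residual_interpolation_1d(observed, tentative):
    """Fill the None entries of `observed` by interpolating observed - tentative.

    observed:  list with sampled values and None at missing positions
    tentative: full-length list of tentatively estimated values
    """
    known = [i for i, v in enumerate(observed) if v is not None]
    if not known:
        raise ValueError("at least one observed sample is required")
    residual = {i: observed[i] - tentative[i] for i in known}
    result = []
    for i in range(len(observed)):
        if i in residual:
            result.append(observed[i])  # keep measured samples untouched
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            r = residual[right]
        elif right is None:
            r = residual[left]
        else:
            w = (i - left) / (right - left)
            r = (1 - w) * residual[left] + w * residual[right]
        result.append(tentative[i] + r)  # add the interpolated residual back
    return result
```

When the residuals are smoother than the signal itself, interpolating them introduces less error than interpolating the observed values directly, which is the hypothesis stated in the abstract.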

19.
IEEE Trans Pattern Anal Mach Intell ; 37(11): 2346-59, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26440272

ABSTRACT

Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. They violate the feature independence assumed in the bag-of-visual-words representation which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval and geometric verification. The retrieval is based on robust detection of repeated image structures and a suitable modification of weights in the bag-of-visual-word model. We also demonstrate that the explicit detection of repeated patterns is beneficial for robust visual word matching for geometric verification. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline as well as the more recently proposed burstiness weighting and Fisher vector encoding.

20.
IEEE Trans Image Process ; 24(10): 3048-59, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26011882

ABSTRACT

Single-sensor imaging using the Bayer color filter array (CFA) and demosaicking is well established for current compact and low-cost color digital cameras. An extension from the CFA to a multispectral filter array (MSFA) enables us to acquire a multispectral image in one shot without increased size or cost. However, multispectral demosaicking for the MSFA has been a challenging problem because of very sparse sampling of each spectral band in the MSFA. In this paper, we propose a high-performance multispectral demosaicking algorithm, and at the same time, a novel MSFA pattern that is suitable for our proposed algorithm. Our key idea is the use of the guided filter to interpolate each spectral band. To generate an effective guide image, in our proposed MSFA pattern, we maintain the sampling density of the G-band as high as the Bayer CFA, and we array each spectral band so that an adaptive kernel can be estimated directly from raw MSFA data. Given these two advantages, we effectively generate the guide image from the most densely sampled G-band using the adaptive kernel. In the experiments, we demonstrate that our proposed algorithm with our proposed MSFA pattern outperforms existing algorithms and provides better color fidelity compared with a conventional color imaging system with the Bayer CFA. We also show some real applications using a multispectral camera prototype we built.
