Search | BVS Ecuador Search Portal

1.

BigNeuron: a resource to benchmark and predict performance of algorithms for automated tracing of neurons in light microscopy datasets.

Manubens-Gil, Linus; Zhou, Zhi; Chen, Hanbo; Ramanathan, Arvind; Liu, Xiaoxiao; Liu, Yufeng; Bria, Alessandro; Gillette, Todd; Ruan, Zongcai; Yang, Jian; Radojevic, Miroslav; Zhao, Ting; Cheng, Li; Qu, Lei; Liu, Siqi; Bouchard, Kristofer E; Gu, Lin; Cai, Weidong; Ji, Shuiwang; Roysam, Badrinath; Wang, Ching-Wei; Yu, Hongchuan; Sironi, Amos; Iascone, Daniel Maxim; Zhou, Jie; Bas, Erhan; Conde-Sousa, Eduardo; Aguiar, Paulo; Li, Xiang; Li, Yujie; Nanda, Sumit; Wang, Yuan; Muresan, Leila; Fua, Pascal; Ye, Bing; He, Hai-Yan; Staiger, Jochen F; Peter, Manuel; Cox, Daniel N; Simonneau, Michel; Oberlaender, Marcel; Jefferis, Gregory; Ito, Kei; Gonzalez-Bellido, Paloma; Kim, Jinhyun; Rubel, Edwin; Cline, Hollis T; Zeng, Hongkui; Nern, Aljoscha; Chiang, Ann-Shyn.

Nat Methods ; 20(6): 824-835, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37069271

ABSTRACT

BigNeuron is an open community bench-testing platform with the goal of setting open standards for accurate and fast automatic neuron tracing. We gathered a diverse set of image volumes across several species that is representative of the data obtained in many neuroscience laboratories interested in neuron tracing. Here, we report generated gold standard manual annotations for a subset of the available imaging datasets and quantified tracing quality for 35 automatic tracing algorithms. The goal of generating such a hand-curated diverse dataset is to advance the development of tracing algorithms and enable generalizable benchmarking. Together with image quality features, we pooled the data in an interactive web application that enables users and developers to perform principal component analysis, t-distributed stochastic neighbor embedding, correlation and clustering, visualization of imaging and tracing data, and benchmarking of automatic tracing algorithms in user-defined data subsets. The image quality metrics explain most of the variance in the data, followed by neuromorphological features related to neuron size. We observed that diverse algorithms can provide complementary information to obtain accurate results and developed a method to iteratively combine methods and generate consensus reconstructions. The consensus trees obtained provide estimates of the neuron structure ground truth that typically outperform single algorithms in noisy datasets. However, specific algorithms may outperform the consensus tree strategy in specific imaging conditions. Finally, to aid users in predicting the most accurate automatic tracing results without manual annotations for comparison, we used support vector machine regression to predict reconstruction quality given an image volume and a set of automatic tracings.

Subject(s)

Benchmarking; Microscopy; Microscopy/methods; Imaging, Three-Dimensional/methods; Neurons/physiology; Algorithms

2.

LiftPose3D, a deep learning-based approach for transforming two-dimensional to three-dimensional poses in laboratory animals.

Gosztolai, Adam; Günel, Semih; Lobato-Ríos, Victor; Pietro Abrate, Marco; Morales, Daniel; Rhodin, Helge; Fua, Pascal; Ramdya, Pavan.

Nat Methods ; 18(8): 975-981, 2021 08.

Article in English | MEDLINE | ID: mdl-34354294

ABSTRACT

Markerless three-dimensional (3D) pose estimation has become an indispensable tool for kinematic studies of laboratory animals. Most current methods recover 3D poses by multi-view triangulation of deep network-based two-dimensional (2D) pose estimates. However, triangulation requires multiple synchronized cameras and elaborate calibration protocols that hinder its widespread adoption in laboratory studies. Here we describe LiftPose3D, a deep network-based method that overcomes these barriers by reconstructing 3D poses from a single 2D camera view. We illustrate LiftPose3D's versatility by applying it to multiple experimental systems using flies, mice, rats and macaques, and in circumstances where 3D triangulation is impractical or impossible. Our framework achieves accurate lifting for stereotypical and nonstereotypical behaviors from different camera angles. Thus, LiftPose3D permits high-quality 3D pose estimation in the absence of complex camera arrays and tedious calibration procedures and despite occluded body parts in freely behaving animals.

Subject(s)

Algorithms; Animals, Laboratory/physiology; Deep Learning; Imaging, Three-Dimensional/methods; Posture/physiology; Animals; Calibration; Drosophila melanogaster; Female; Macaca; Mice; Rats

3.

Are Existing Monocular Computer Vision-Based 3D Motion Capture Approaches Ready for Deployment? A Methodological Study on the Example of Alpine Skiing.

Ostrek, Mirela; Rhodin, Helge; Fua, Pascal; Müller, Erich; Spörri, Jörg.

Sensors (Basel) ; 19(19), 2019 Oct 06.

Article in English | MEDLINE | ID: mdl-31590465

ABSTRACT

In this study, we compared a monocular computer vision (MCV)-based approach with the gold standard for collecting kinematic data on ski tracks (i.e., video-based stereophotogrammetry) and assessed its deployment readiness for answering applied research questions in the context of alpine skiing. The investigated MCV-based approach predicted the three-dimensional human pose and ski orientation based on the image data from a single camera. The data set used for training and testing the underlying deep nets originated from a field experiment with six competitive alpine skiers. The normalized mean per joint position error of the MCV-based approach was found to be 0.08 ± 0.01 m. Knee flexion showed an accuracy and precision (in parentheses) of 0.4 ± 7.1° (7.2 ± 1.5°) for the outside leg, and -0.2 ± 5.0° (6.7 ± 1.1°) for the inside leg. For hip flexion, the corresponding values were -0.4 ± 6.1° (4.4 ± 1.5°) and -0.7 ± 4.7° (3.7 ± 1.0°), respectively. The accuracy and precision of skiing-related metrics were revealed to be 0.03 ± 0.01 m (0.01 ± 0.00 m) for relative center of mass position, -0.1 ± 3.8° (3.4 ± 0.9°) for lean angle, 0.01 ± 0.03 m (0.02 ± 0.01 m) for center of mass to outside ankle distance, 0.01 ± 0.05 m (0.03 ± 0.01 m) for fore/aft position, and 0.00 ± 0.01 m² (0.01 ± 0.00 m²) for drag area. Such magnitudes can be considered acceptable for detecting relevant differences in the context of alpine skiing.
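
To make the headline error metric concrete, the sketch below computes a plain mean per joint position error (MPJPE) between predicted and reference 3D joints. The function name `mpjpe` and the toy coordinates are assumptions made for illustration only. The normalization used in the study is not described in the abstract and is not reproduced here.

```python
import math
from typing import Sequence

Point3D = Sequence[float]


def mpjpe(predicted: Sequence[Point3D], reference: Sequence[Point3D]) -> float:
    """Mean Euclidean distance between corresponding joints (same unit as the input)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty joint lists of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)


# Hypothetical joint positions in meters (not data from the study).
reference = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 1.0, 0.0)]
predicted = [(0.0, 0.0, 0.1), (0.0, 0.5, 0.0), (0.2, 1.0, 0.0)]

error = mpjpe(predicted, reference)
print(f"MPJPE: {error:.3f} m")
```

The per-joint distances here are 0.1 m, 0.0 m and 0.2 m, so the mean is 0.1 m. A normalized variant would rescale the poses before this step.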

4.

Dendritic tree extraction from noisy maximum intensity projection images in C. elegans.

Greenblum, Ayala; Sznitman, Raphael; Fua, Pascal; Arratia, Paulo E; Oren, Meital; Podbilewicz, Benjamin; Sznitman, Josué.

Biomed Eng Online ; 13: 74, 2014 Jun 12.

Article in English | MEDLINE | ID: mdl-25012210

ABSTRACT

BACKGROUND: Maximum Intensity Projections (MIP) of neuronal dendritic trees obtained from confocal microscopy are frequently used to study the relationship between tree morphology and mechanosensory function in the model organism C. elegans. Extracting dendritic trees from noisy images remains, however, a strenuous process that has traditionally relied on manual approaches. Here, we focus on automated and reliable 2D segmentations of dendritic trees following a statistical learning framework. METHODS: Our dendritic tree extraction (DTE) method uses small amounts of labelled training data on MIPs to learn noise models of texture-based features from the responses of tree structures and image background. Our strategy lies in evaluating statistical models of noise that account for both the variability generated from the imaging process and from the aggregation of information in the MIP images. These noise models are then used within a probabilistic, or Bayesian, framework to provide a coarse 2D dendritic tree segmentation. Finally, some post-processing is applied to refine the segmentations and provide skeletonized trees using a morphological thinning process. RESULTS: Following a Leave-One-Out Cross Validation (LOOCV) method for an MIP database with available "ground truth" images, we demonstrate that our approach provides significant improvements in tree-structure segmentations over traditional intensity-based methods. Improvements for MIPs under various imaging conditions are both qualitative and quantitative, as measured from Receiver Operating Characteristic (ROC) curves and the yield and error rates in the final segmentations. In a final step, we demonstrate our DTE approach on previously unseen MIP samples including the extraction of skeletonized structures, and compare our method to a state-of-the-art dendritic tree tracing software. CONCLUSIONS: Overall, our DTE method allows for robust dendritic tree segmentations in noisy MIPs, outperforming traditional intensity-based methods. Such an approach provides a usable segmentation framework, ultimately delivering a speed-up for dendritic tree identification on the user end and a reliable first step towards further morphological characterizations of tree arborization.

Subject(s)

Caenorhabditis elegans/cytology; Dendrites; Image Processing, Computer-Assisted/methods; Microscopy, Confocal/methods; Algorithms; Animals; Mechanotransduction, Cellular

5.

Detecting Road Obstacles by Erasing Them.

Lis, Krzysztof; Honari, Sina; Fua, Pascal; Salzmann, Mathieu.

IEEE Trans Pattern Anal Mach Intell ; 46(4): 2450-2460, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38019625

ABSTRACT

Vehicles can encounter a myriad of obstacles on the road, and it is impossible to record them all beforehand to train a detector. Instead, we select image patches and inpaint them with the surrounding road texture, which tends to remove obstacles from those patches. We then use a network trained to recognize discrepancies between the original patch and the inpainted one, which signals an erased obstacle.

6.

A Closed-Form, Pairwise Solution to Local Non-Rigid Structure-from-Motion.

Parashar, Shaifali; Long, Yuxuan; Salzmann, Mathieu; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Apr 04.

Article in English | MEDLINE | ID: mdl-38578851

ABSTRACT

A recent trend in Non-Rigid Structure-from-Motion (NRSfM) is to express local, differential constraints between pairs of images, from which the surface normal at any point can be obtained by solving a system of polynomial equations. While this approach is more successful than its counterparts relying on global constraints, the resulting methods face two main problems: First, most of the equation systems they formulate are of high degree and must be solved using computationally expensive polynomial solvers. Some methods use polynomial reduction strategies to simplify the system, but this adds some phantom solutions. In any event, an additional mechanism is employed to pick the best solution, which adds to the computation without any guarantees on the reliability of the solution. Second, these methods formulate constraints between a pair of images. Even if there is enough motion between them, they may suffer from local degeneracies that make the resulting estimates unreliable without any warning mechanism. In this paper, we solve these problems for isometric/conformal NRSfM. We show that, under widely applicable assumptions, we can derive a new system of equations in terms of the surface normals, whose two solutions can be obtained in closed-form and can easily be disambiguated locally. Our formalism also allows us to assess how reliable the estimated local normals are and to discard them if they are not. Our experiments show that our reconstructions, obtained from two or more views, are significantly more accurate than those of state-of-the-art methods, while also being faster.

7.

DeepMesh: Differentiable Iso-Surface Extraction.

Guillard, Benoit; Remelli, Edoardo; Lukoianov, Artem; Yvernay, Pierre; Richter, Stephan R; Bagautdinov, Timur; Baque, Pierre; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; PP, 2024 Apr 22.

Article in English | MEDLINE | ID: mdl-38648137

ABSTRACT

Geometric Deep Learning has recently made striking progress with the advent of continuous deep implicit fields. They allow for detailed modeling of watertight surfaces of arbitrary topology while not relying on a 3D Euclidean grid, resulting in a learnable parameterization that is unlimited in resolution. Unfortunately, these methods are often unsuitable for applications that require an explicit mesh-based surface representation because converting an implicit field to such a representation relies on the Marching Cubes algorithm, which cannot be differentiated with respect to the underlying implicit field. In this work, we remove this limitation and introduce a differentiable way to produce explicit surface mesh representations from Deep Implicit Fields. Our key insight is that by reasoning on how implicit field perturbations impact local surface geometry, one can ultimately differentiate the 3D location of surface samples with respect to the underlying deep implicit field. We exploit this to define DeepMesh - an end-to-end differentiable mesh representation that can vary its topology. We validate our theoretical insight through several applications: Single view 3D Reconstruction via Differentiable Rendering, Physically-Driven Shape Optimization, Full Scene 3D Reconstruction from Scans and End-to-End Training. In all cases our end-to-end differentiable parameterization gives us an edge over state-of-the-art algorithms.

8.

Temporal Representation Learning on Monocular Videos for 3D Human Pose Estimation.

Honari, Sina; Constantin, Victor; Rhodin, Helge; Salzmann, Mathieu; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 45(5): 6415-6427, 2023 05.

Article in English | MEDLINE | ID: mdl-36251908

ABSTRACT

In this article we propose an unsupervised feature extraction method to capture temporal information on monocular videos, where we detect and encode the subject of interest in each frame and leverage contrastive self-supervised (CSS) learning to extract rich latent vectors. Instead of simply treating the latent features of nearby frames as positive pairs and those of temporally-distant ones as negative pairs as in other CSS approaches, we explicitly disentangle each latent vector into a time-variant component and a time-invariant one. We then show that applying contrastive loss only to the time-variant features and encouraging a gradual transition on them between nearby and distant frames, while also reconstructing the input, extracts rich temporal features that are well suited for human pose estimation. Our approach reduces error by about 50% compared to the standard CSS strategies, outperforms other unsupervised single-view methods and matches the performance of multi-view techniques. When 2D pose is available, our approach can extract even richer latent features and improve the 3D pose estimation accuracy, outperforming other state-of-the-art weakly supervised methods.

Subject(s)

Algorithms; Learning; Humans; Videotape Recording

9.

Persistent Homology With Improved Locality Information for More Effective Delineation.

Oner, Doruk; Garin, Adelie; Kozinski, Mateusz; Hess, Kathryn; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 45(8): 10588-10595, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37028072

ABSTRACT

Persistent Homology (PH) has been successfully used to train networks to detect curvilinear structures and to improve the topological quality of their results. However, existing methods are very global and ignore the location of topological features. In this paper, we remedy this by introducing a new filtration function that fuses two earlier approaches: thresholding-based filtration, previously used to train deep networks to segment medical images, and filtration with height functions, typically used to compare 2D and 3D shapes. We experimentally demonstrate that deep networks trained using our PH-based loss function yield reconstructions of road networks and neuronal processes that reflect ground-truth connectivity better than networks trained with existing loss functions based on PH.

10.

Adjusting the Ground Truth Annotations for Connectivity-Based Learning to Delineate.

Oner, Doruk; Kozinski, Mateusz; Citraro, Leonardo; Fua, Pascal.

IEEE Trans Med Imaging ; 41(12): 3675-3685, 2022 Dec.

Article in English | MEDLINE | ID: mdl-35862340

ABSTRACT

Deep learning-based approaches to delineating 3D structure depend on accurate annotations to train the networks. Yet in practice, people, no matter how conscientious, have trouble precisely delineating in 3D and on a large scale, in part because the data is often hard to interpret visually and in part because the 3D interfaces are awkward to use. In this paper, we introduce a method that explicitly accounts for annotation inaccuracies. To this end, we treat the annotations as active contour models that can deform themselves while preserving their topology. This enables us to jointly train the network and correct potential errors in the original annotations. The result is an approach that boosts performance of deep networks trained with potentially inaccurate annotations.

11.

Counting People by Estimating People Flows.

Liu, Weizhe; Salzmann, Mathieu; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 44(11): 8151-8166, 2022 11.

Article in English | MEDLINE | ID: mdl-34351854

ABSTRACT

Modern methods for counting people in crowded scenes rely on deep networks to estimate people densities in individual images. As such, only very few take advantage of temporal consistency in video sequences, and those that do only impose weak smoothness constraints across consecutive frames. In this paper, we advocate estimating people flows across image locations between consecutive images and inferring the people densities from these flows instead of directly regressing them. This enables us to impose much stronger constraints encoding the conservation of the number of people. As a result, it significantly boosts performance without requiring a more complex architecture. Furthermore, it allows us to exploit the correlation between people flow and optical flow to further improve the results. We also show that leveraging people conservation constraints in both a spatial and temporal manner makes it possible to train a deep crowd counting model in an active learning setting with far fewer annotations. This significantly reduces the annotation cost while still leading to similar performance to the full supervision case.

Subject(s)

Algorithms; Crowding; Humans
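
The conservation idea in the abstract above can be sketched on a toy grid. If `flow[i][j]` is the number of people moving from cell `i` to cell `j` between two consecutive frames, the densities in both frames follow from row and column sums, and their totals agree by construction. The three-cell layout, the numbers, the function name `densities_from_flows` and the closed-scene assumption (nobody enters or leaves the image) are all illustrative assumptions. The published method predicts the flows with a deep network, which this sketch does not model.

```python
from typing import List, Tuple


def densities_from_flows(flow: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Return (densities in the earlier frame, densities in the later frame).

    flow[i][j] is the number of people moving from cell i to cell j;
    flow[i][i] counts the people who stay in cell i.
    """
    previous = [sum(row) for row in flow]              # everyone who was in cell i
    current = [sum(column) for column in zip(*flow)]   # everyone who ends up in cell j
    return previous, current


# Hypothetical flows on a closed scene with three cells.
flow = [
    [2.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 0.0, 4.0],
]
previous, current = densities_from_flows(flow)
print(previous, current, sum(previous) == sum(current))
```

Because both density maps are derived from the same flow table, the total head count cannot change between frames, a constraint that regressing the two densities independently would not enforce.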

12.

Robust Differentiable SVD.

Wang, Wei; Dang, Zheng; Hu, Yinlin; Fua, Pascal; Salzmann, Mathieu.

IEEE Trans Pattern Anal Mach Intell ; 44(9): 5472-5487, 2022 Sep.

Article in English | MEDLINE | ID: mdl-33844626

ABSTRACT

Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms. However, the derivatives of the eigenvectors tend to be numerically unstable, whether using the SVD to compute them analytically or using the Power Iteration (PI) method to approximate them. This instability arises in the presence of eigenvalues that are close to each other. This makes integrating eigendecomposition into deep networks difficult and often results in poor convergence, particularly when dealing with large matrices. While this can be mitigated by partitioning the data into small arbitrary groups, doing so has no theoretical basis and makes it impossible to exploit the full power of eigendecomposition. In previous work, we mitigated this using SVD during the forward pass and PI to compute the gradients during the backward pass. However, the iterative deflation procedure required to compute multiple eigenvectors using PI tends to accumulate errors and yield inaccurate gradients. Here, we show that the Taylor expansion of the SVD gradient is theoretically equivalent to the gradient obtained using PI without relying in practice on an iterative process and thus yields more accurate gradients. We demonstrate the benefits of this increased accuracy for image classification and style transfer.
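
For readers unfamiliar with the Power Iteration (PI) method mentioned above, the following is a minimal sketch of the textbook algorithm for the dominant eigenpair of a symmetric matrix. It is not the gradient computation proposed in the paper. The function name `power_iteration`, the step count and the 2×2 test matrix are assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def power_iteration(a: Matrix, steps: int = 200) -> Tuple[float, List[float]]:
    """Approximate the dominant eigenvalue and eigenvector of a symmetric matrix."""
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n  # arbitrary unit-length start vector
    for _ in range(steps):
        # Multiply by the matrix, then renormalize to unit length.
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of the matrix")
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(x * y for x, y in zip(v, av))  # Rayleigh quotient, since |v| = 1
    return eigenvalue, v


a = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, eigenvector = power_iteration(a)
print(eigenvalue, eigenvector)
```

For this matrix the exact dominant eigenvalue is (5 + √5)/2. Convergence of the iteration is governed by the ratio of the two largest eigenvalue magnitudes, so it slows down as they approach each other.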

13.

3D reconstruction of curvilinear structures with stereo matching deep convolutional neural networks.

Altingövde, Okan; Mishchuk, Anastasiia; Ganeeva, Gulnaz; Oveisi, Emad; Hebert, Cecile; Fua, Pascal.

Ultramicroscopy ; 234: 113460, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35121280

ABSTRACT

Curvilinear structures frequently appear in microscopy imaging as the object of interest. Crystallographic defects, i.e., dislocations, are one of the curvilinear structures that have been repeatedly investigated under transmission electron microscopy (TEM), and their 3D structural information is of great importance for understanding the properties of materials. 3D information of dislocations is often obtained by tomography, which is a cumbersome process since it requires acquiring many images with different tilt angles and similar imaging conditions. Although alternative stereoscopy methods lower the number of required images to two, they still require human intervention and shape priors for accurate 3D estimation. We propose a fully automated pipeline for both detection and matching of curvilinear structures in stereo pairs by utilizing deep convolutional neural networks (CNNs) without making any prior assumption on 3D shapes. In this work, we mainly focus on 3D reconstruction of dislocations from stereo pairs of TEM images.

14.

GarNet++: Improving Fast and Accurate Static 3D Cloth Draping by Curvature Loss.

Gundogdu, Erhan; Constantin, Victor; Parashar, Shaifali; Seifoddini, Amrollah; Dang, Minh; Salzmann, Mathieu; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 44(1): 181-195, 2022 01.

Article in English | MEDLINE | ID: mdl-32750825

ABSTRACT

In this paper, we tackle the problem of static 3D cloth draping on virtual human bodies. We introduce a two-stream deep network model that produces a visually plausible draping of a template cloth on virtual 3D bodies by extracting features from both the body and garment shapes. Our network learns to mimic a physics-based simulation (PBS) method while requiring two orders of magnitude less computation time. To train the network, we introduce loss terms inspired by PBS to produce plausible results and make the model collision-aware. To increase the details of the draped garment, we introduce two loss functions that penalize the difference between the curvature of the predicted cloth and PBS. Particularly, we study the impact of mean curvature normal and a novel detail-preserving loss both qualitatively and quantitatively. Our new curvature loss computes the local covariance matrices of the 3D points, and compares the Rayleigh quotients of the prediction and PBS. This leads to more details while performing favorably or comparably against the loss that considers mean curvature normal vectors in the 3D triangulated meshes. We validate our framework on four garment types for various body shapes and poses. Finally, we achieve superior performance against a recently proposed data-driven method.

Subject(s)

Algorithms; Computer Simulation; Humans

15.

Promoting Connectivity of Network-Like Structures by Enforcing Region Separation.

Oner, Doruk; Kozinski, Mateusz; Citraro, Leonardo; Dadap, Nathan C; Konings, Alexandra G; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 44(9): 5401-5413, 2022 Sep.

Article in English | MEDLINE | ID: mdl-33881988

ABSTRACT

We propose a novel, connectivity-oriented loss function for training deep convolutional networks to reconstruct network-like structures, like roads and irrigation canals, from aerial images. The main idea behind our loss is to express the connectivity of roads, or canals, in terms of disconnections that they create between background regions of the image. In simple terms, a gap in the predicted road causes two background regions that lie on opposite sides of a ground-truth road to touch in the prediction. Our loss function is designed to prevent such unwanted connections between background regions, and therefore close the gaps in predicted roads. It also prevents predicting false positive roads and canals by penalizing unwarranted disconnections of background regions. In order to capture even short, dead-ending road segments, we evaluate the loss in small image crops. We show, in experiments on two standard road benchmarks and a new data set of irrigation canals, that convnets trained with our loss function recover road connectivity so well that it suffices to skeletonize their output to produce state-of-the-art maps. A distinct advantage of our approach is that the loss can be plugged into any existing training setup without further modifications.

16.

Self-Supervised Human Detection and Segmentation via Background Inpainting.

Katircioglu, Isinsu; Rhodin, Helge; Constantin, Victor; Spörri, Jörg; Salzmann, Mathieu; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 44(12): 9574-9588, 2022 12.

Article in English | MEDLINE | ID: mdl-34714741

ABSTRACT

While supervised object detection and segmentation methods achieve impressive accuracy, they generalize poorly to images whose appearance significantly differs from the data they have been trained on. To address this when annotating data is prohibitively expensive, we introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera. At the heart of our approach lies the observation that object segmentation and background reconstruction are linked tasks, and that, for structured scenes, background regions can be re-synthesized from their surroundings, whereas regions depicting the moving object cannot. We encode this intuition into a self-supervised loss function that we exploit to train a proposal-based segmentation network. To account for the discrete nature of the proposals, we develop a Monte Carlo-based training strategy that allows the algorithm to explore the large space of object proposals. We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.

Subject(s)

Image Processing, Computer-Assisted; Supervised Machine Learning; Humans; Image Processing, Computer-Assisted/methods; Algorithms

17.

Matching Seqlets: An Unsupervised Approach for Locality Preserving Sequence Matching.

Qiu, Jiayan; Wang, Xinchao; Fua, Pascal; Tao, Dacheng.

IEEE Trans Pattern Anal Mach Intell ; 43(2): 745-752, 2021 02.

Article in English | MEDLINE | ID: mdl-31425018

ABSTRACT

In this paper, we propose a novel unsupervised approach for sequence matching by explicitly accounting for the locality properties in the sequences. In contrast to conventional approaches that rely on frame-to-frame matching, we conduct matching using sequencelet or seqlet, a sub-sequence wherein the frames share strong similarities and are thus grouped together. The optimal seqlets and matching between them are learned jointly, without any supervision from users. The learned seqlets preserve the locality information at the scale of interest and resolve the ambiguities during matching, which are omitted by frame-based matching methods. We show that our proposed approach outperforms the state-of-the-art ones on datasets of different domains including human actions, facial expressions, speech, and character strokes.

18.

Eigendecomposition-Free Training of Deep Networks for Linear Least-Square Problems.

Dang, Zheng; Yi, Kwang Moo; Hu, Yinlin; Wang, Fei; Fua, Pascal; Salzmann, Mathieu.

IEEE Trans Pattern Anal Mach Intell ; 43(9): 3167-3182, 2021 Sep.

Article in English | MEDLINE | ID: mdl-32149625

ABSTRACT

Many classical Computer Vision problems, such as essential matrix computation and pose estimation from 3D to 2D correspondences, can be tackled by solving a linear least-square problem, which can be done by finding the eigenvector corresponding to the smallest, or zero, eigenvalue of a matrix representing a linear system. Incorporating this in deep learning frameworks would allow us to explicitly encode known notions of geometry, instead of having the network implicitly learn them from data. However, performing eigendecomposition within a network requires the ability to differentiate this operation. While theoretically doable, this introduces numerical instability in the optimization process in practice. In this paper, we introduce an eigendecomposition-free approach to training a deep network whose loss depends on the eigenvector corresponding to a zero eigenvalue of a matrix predicted by the network. We demonstrate that our approach is much more robust than explicit differentiation of the eigendecomposition using two general tasks, outlier rejection and denoising, with several practical examples including wide-baseline stereo, the perspective-n-point problem, and ellipse fitting. Empirically, our method has better convergence properties and yields state-of-the-art results.

19.

Joint Segmentation and Path Classification of Curvilinear Structures.

Mosinska, Agata; Kozinski, Mateusz; Fua, Pascal.

IEEE Trans Pattern Anal Mach Intell ; 42(6): 1515-1521, 2020 06.

Article in English | MEDLINE | ID: mdl-31180837

ABSTRACT

Detection of curvilinear structures in images has long been of interest. One of the most challenging aspects of this problem is inferring the graph representation of the curvilinear network. Most existing delineation approaches first perform binary segmentation of the image and then refine it using either a set of hand-designed heuristics or a separate classifier that assigns likelihood to paths extracted from the pixel-wise prediction. In our work, we bridge the gap between segmentation and path classification by training a deep network that performs those two tasks simultaneously. We show that this approach is beneficial because it enforces consistency across the whole processing pipeline. We apply our approach on roads and neurons datasets.

20.

Tracing in 2D to reduce the annotation effort for 3D deep delineation of linear structures.

Kozinski, Mateusz; Mosinska, Agata; Salzmann, Mathieu; Fua, Pascal.

Med Image Anal ; 60: 101590, 2020 02.

Article in English | MEDLINE | ID: mdl-31841949

ABSTRACT

The difficulty of obtaining annotations to build training databases still slows down the adoption of recent deep learning approaches for biomedical image analysis. In this paper, we show that we can train a Deep Net to perform 3D volumetric delineation given only 2D annotations in Maximum Intensity Projections (MIP) of the training volumes. This significantly reduces the annotation time: We conducted a user study that suggests that annotating 2D projections is on average twice as fast as annotating the original 3D volumes. Our technical contribution is a loss function that evaluates a 3D prediction against annotations of 2D projections. It is inspired by space carving, a classical approach to reconstructing complex 3D shapes from arbitrarily-positioned cameras. It can be used to train any deep network with volumetric output, without the need to change the network's architecture. Substituting the loss is all it takes to enable 2D annotations in an existing training setup. In extensive experiments on 3D light microscopy images of neurons and retinal blood vessels, and on Magnetic Resonance Angiography (MRA) brain scans, we show that, when trained on projection annotations, deep delineation networks perform as well as when they are trained using costlier 3D annotations.

Subject(s)

Image Processing, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Magnetic Resonance Angiography; Neural Networks, Computer; Brain/blood supply; Brain/diagnostic imaging; Datasets as Topic; Deep Learning; Humans; Retinal Vessels/diagnostic imaging
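
As background for the Maximum Intensity Projections (MIP) that the entry above annotates in 2D, the sketch below shows how a MIP is formed from a 3D volume by taking the maximum along one axis. The `volume[z][y][x]` indexing, the function name `max_intensity_projection` and the 2×2×2 toy volume are assumptions for illustration. The projection-based loss function described in the abstract is not reproduced here.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[z][y][x]
Image = List[List[float]]


def max_intensity_projection(volume: Volume, axis: int) -> Image:
    """Collapse one axis of a 3D volume by keeping the brightest voxel along it."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # project along z, result indexed [y][x]
        return [[max(volume[z][y][x] for z in range(depth)) for x in range(width)]
                for y in range(height)]
    if axis == 1:  # project along y, result indexed [z][x]
        return [[max(volume[z][y][x] for y in range(height)) for x in range(width)]
                for z in range(depth)]
    if axis == 2:  # project along x, result indexed [z][y]
        return [[max(volume[z][y][x] for x in range(width)) for y in range(height)]
                for z in range(depth)]
    raise ValueError("axis must be 0, 1 or 2")


# Hypothetical 2x2x2 volume of voxel intensities.
volume = [
    [[0.0, 1.0], [2.0, 0.0]],
    [[5.0, 0.0], [0.0, 3.0]],
]
for axis in range(3):
    print(axis, max_intensity_projection(volume, axis))
```

Each projection discards the position of the brightest voxel along the collapsed axis, which is why a 2D annotation of a MIP constrains the 3D structure only partially and several projections are needed to pin it down.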
