Results 1 - 20 of 24
1.
Ecol Lett ; 25(1): 89-100, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34725912

ABSTRACT

Infections early in life can have enduring effects on an organism's development and immunity. In this study, we show that this equally applies to developing 'superorganisms'--incipient social insect colonies. When we exposed newly mated Lasius niger ant queens to a low pathogen dose, their colonies grew more slowly than controls before winter, but reached similar sizes afterwards. Independent of exposure, queen hibernation survival improved when the ratio of pupae to workers was small. Queens that reared fewer pupae before worker emergence exhibited lower pathogen levels, indicating that high brood rearing efforts interfere with the ability of the queen's immune system to suppress pathogen proliferation. Early-life queen pathogen exposure also improved the immunocompetence of her worker offspring, as demonstrated by challenging the workers to the same pathogen a year later. Transgenerational transfer of the queen's pathogen experience to her workforce can hence durably reduce the disease susceptibility of the whole superorganism.


Subjects
Ants, Animals, Female, Humans, Reproduction, Seasons, Social Behavior
2.
Sensors (Basel) ; 22(2), 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35062595

ABSTRACT

The article presents an AI-based fungi species recognition system for a citizen-science community. The system's real-time identification tool, FungiVision, with a mobile application front-end, led to increased public interest in fungi, quadrupling the number of citizens collecting data. FungiVision, deployed with a human-in-the-loop, reaches nearly 93% accuracy. Using the collected data, we developed a novel fine-grained classification dataset - Danish Fungi 2020 (DF20) - with several unique characteristics: species-level labels, a small number of errors, and rich observation metadata. The dataset enables testing whether metadata, e.g., time, location, habitat and substrate, improve classification, facilitates classifier calibration testing, and allows the study of the impact of device settings on classification performance. The continual flow of labelled data supports improvements of the online recognition system. Finally, we present a novel method for the fungi recognition service, based on a Vision Transformer architecture. Trained on DF20 and exploiting available metadata, it achieves a recognition error that is 46.75% lower than that of the current system. By providing a stream of labeled data in one direction, and an accuracy increase in the other, the collaboration creates a virtuous cycle helping both communities.
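For illustration, a minimal sketch of how observation metadata could be folded into an image classifier's prediction via a Bayes-style prior correction; the function and the toy priors below are hypothetical and are not the paper's actual fusion method.

```python
import numpy as np

def adjust_with_metadata_prior(image_probs, class_prior, metadata_prior):
    """Re-weight image-only class posteriors p(y|x) by a metadata-dependent
    class prior p(y|m), assuming the classifier was trained under prior p(y).

    image_probs    : (C,) softmax output of the image classifier
    class_prior    : (C,) class frequencies seen during training
    metadata_prior : (C,) class frequencies given the observation metadata
                     (e.g., month and habitat), estimated from the dataset
    """
    # Bayes-style correction: p(y|x, m) is proportional to p(y|x) * p(y|m) / p(y)
    adjusted = image_probs * metadata_prior / np.clip(class_prior, 1e-12, None)
    return adjusted / adjusted.sum()

# toy example with 3 species
probs = np.array([0.5, 0.3, 0.2])
train_prior = np.array([1/3, 1/3, 1/3])
month_prior = np.array([0.1, 0.7, 0.2])   # species 2 is common this month
print(adjust_with_metadata_prior(probs, train_prior, month_prior))
```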


Subjects
Deep Learning, Mobile Applications, Fungi, Humans, Mycology, Computer Neural Networks
3.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 2758-2769, 2024 May.
Article in English | MEDLINE | ID: mdl-37999969

ABSTRACT

We present CG-NeRF, a cascade and generalizable neural radiance fields method for view synthesis. Recent generalizing view synthesis methods can render high-quality novel views using a set of nearby input views. However, the rendering speed is still slow due to the uniform point sampling inherent to neural radiance fields. Existing scene-specific methods can train and render novel views efficiently but cannot generalize to unseen data. Our approach addresses the problem of fast and generalizable view synthesis by proposing two novel modules: a coarse radiance fields predictor and a convolution-based neural renderer. This architecture infers consistent scene geometry based on the implicit neural fields and renders new views efficiently using a single GPU. We first train CG-NeRF on multiple 3D scenes of the DTU dataset, and the network can produce high-quality and accurate novel views on unseen real and synthetic data using only photometric losses. Moreover, our method can leverage a denser set of reference images of a single scene to produce accurate novel views without relying on additional explicit representations, while still maintaining the high-speed rendering of the pre-trained model. Experimental results show that CG-NeRF outperforms state-of-the-art generalizable neural rendering methods on various synthetic and real datasets.
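For context, a minimal NumPy sketch of the standard NeRF-style volume rendering quadrature that any radiance-field renderer, including a coarse predictor, builds on; it is a generic illustration under standard NeRF assumptions, not CG-NeRF's specific modules.

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Standard NeRF-style volume rendering quadrature along one ray.

    sigmas : (N,) volume densities at the sampled points
    colors : (N, 3) RGB values at the sampled points
    deltas : (N,) distances between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)              # opacity per sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1] + 1e-10]))
    weights = trans * alphas                             # contribution of each sample
    rgb = (weights[:, None] * colors).sum(axis=0)        # composited pixel colour
    return rgb, weights

# a coarse pass could use the returned weights to place finer samples
sigmas = np.array([0.0, 0.5, 2.0, 0.1])
colors = np.random.rand(4, 3)
deltas = np.full(4, 0.25)
print(render_ray(sigmas, colors, deltas))
```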

4.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 123-136, 2023 01.
Article in English | MEDLINE | ID: mdl-35239475

ABSTRACT

Humans can robustly recognize and localize objects by using visual and/or auditory cues. While machines are already able to do the same with visual data, less work has been done with sounds. This work develops an approach for scene understanding based purely on binaural sounds. The considered tasks include predicting the semantic masks of sound-making objects, the motion of sound-making objects, and the depth map of the scene. To this end, we propose a novel sensor setup and record a new audio-visual dataset of street scenes with eight professional binaural microphones and a 360° camera. The co-existence of visual and audio cues is leveraged for supervision transfer. In particular, we employ a cross-modal distillation framework that consists of multiple vision 'teacher' methods and a sound 'student' method - the student method is trained to generate the same results as the teacher methods do. This way, the auditory system can be trained without using human annotations. To further boost the performance, we propose another novel auxiliary task, coined Spatial Sound Super-Resolution, to increase the directional resolution of sounds. We then formulate the four tasks into one end-to-end trainable multi-tasking network aiming to boost the overall performance. Experimental results show that 1) our method achieves good results for all four tasks, 2) the four tasks are mutually beneficial - training them together achieves the best performance, 3) the number and orientation of microphones are both important, and 4) features learned from the standard spectrogram and features obtained by the classic signal processing pipeline are complementary for auditory perception tasks. The data and code are released on the project page: https://www.trace.ethz.ch/publications/2020/sound_perception/index.html.
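A minimal sketch of one cross-modal distillation step in the spirit described above, assuming PyTorch and hypothetical `student` and `teacher` modules: the frozen vision teacher provides pseudo-labels and the audio student is trained to match them, so no human annotation is needed.

```python
import torch
import torch.nn.functional as F

def distillation_step(student, teacher, spectrogram, frame, optimizer):
    """One cross-modal distillation step: the audio student mimics the
    frozen vision teacher, so no human annotation is required.

    student     : network mapping a binaural spectrogram to per-pixel class logits
    teacher     : frozen vision network producing same-shaped logits from the frame
    spectrogram : (B, C_audio, F, T) batched binaural spectrograms
    frame       : (B, 3, H, W) time-aligned video frames
    """
    with torch.no_grad():
        target = teacher(frame).softmax(dim=1)          # pseudo ground truth
    pred = student(spectrogram)                         # (B, num_classes, H, W)
    loss = F.kl_div(pred.log_softmax(dim=1), target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```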


Subjects
Algorithms, Semantics, Humans, Sound, Learning, Cues (Psychology)
5.
IEEE Trans Pattern Anal Mach Intell ; 45(5): 6552-6574, 2023 May.
Article in English | MEDLINE | ID: mdl-36215368

ABSTRACT

Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems. It entails estimating the trajectory of the target in an image sequence, given only its initial location and segmentation, or a rough approximation of these in the form of a bounding box. Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as the dominant tracking paradigms and have led to significant progress. Following the rapid evolution of visual object tracking in the last decade, this survey presents a systematic and thorough review of more than 90 DCF and Siamese trackers, based on results on nine tracking benchmarks. First, we present the background theory of both the DCF and Siamese tracking core formulations. Then, we distinguish and comprehensively review the shared as well as specific open research challenges in both these tracking paradigms. Furthermore, we thoroughly analyze the performance of DCF and Siamese trackers on nine benchmarks, covering different experimental aspects of visual tracking: datasets, evaluation metrics, performance, and speed comparisons. We finish the survey by presenting recommendations and suggestions for distinguished open challenges based on our analysis.
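For background, a minimal single-channel discriminative correlation filter in closed form (MOSSE-style), illustrating the DCF core formulation the survey reviews; it is a generic sketch, not any particular tracker from the survey.

```python
import numpy as np

def train_dcf(patch, target_response, lam=1e-2):
    """Closed-form single-channel correlation filter (MOSSE-style).

    patch           : (H, W) grayscale training patch, ideally windowed
    target_response : (H, W) desired response, e.g., a Gaussian peak at the target
    lam             : regularization weight
    """
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(target_response)
    H = (np.conj(X) * Y) / (np.conj(X) * X + lam)   # filter in the Fourier domain
    return H

def detect(H, patch):
    """Correlate the learned filter with a new patch; the peak gives the shift."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
    return np.unravel_index(response.argmax(), response.shape)
```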

6.
IEEE Trans Pattern Anal Mach Intell ; 44(9): 4961-4974, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33830919

ABSTRACT

We propose Graph-Cut RANSAC, GC-RANSAC in short, a new robust geometric model estimation method where the local optimization step is formulated as energy minimization with binary labeling, applying the graph-cut algorithm to select inliers. The minimized energy reflects the assumption that geometric data often form spatially coherent structures - it includes both a unary component representing point-to-model residuals and a binary term promoting spatially coherent inlier-outlier labelling of neighboring points. The proposed local optimization step is conceptually simple, easy to implement, and efficient, with a globally optimal inlier selection given the model parameters. Graph-Cut RANSAC, equipped with "the bells and whistles" of USAC and MAGSAC++, was tested on a range of problems using a number of publicly available datasets for homography, 6D object pose, fundamental and essential matrix estimation. It is more geometrically accurate than state-of-the-art robust estimators, fails less often, and runs faster than or at a speed similar to less accurate alternatives. The source code is available at https://github.com/danini/graph-cut-ransac.
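A minimal sketch of the kind of binary-labeling energy such a local optimization minimizes: a unary term from point-to-model residuals plus a Potts term penalizing differently labeled neighbors. The truncation and weights are illustrative, and the actual graph-cut minimization (e.g., via a max-flow solver) is omitted.

```python
import numpy as np

def labeling_energy(residuals, labels, neighbors, threshold, lam):
    """Energy of a binary inlier/outlier labeling of the data points.

    residuals : (N,) point-to-model residuals
    labels    : (N,) 1 = inlier, 0 = outlier
    neighbors : list of (i, j) index pairs of spatially neighboring points
    threshold : inlier-outlier threshold used in the unary term
    lam       : weight of the spatial coherence (Potts) term
    """
    # unary: inliers pay their (truncated) squared residual, outliers a constant
    unary = np.where(labels == 1,
                     np.minimum(residuals**2 / threshold**2, 1.0),
                     1.0).sum()
    # pairwise: neighboring points with different labels are penalized
    pairwise = sum(labels[i] != labels[j] for i, j in neighbors)
    return unary + lam * pairwise
```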

7.
Front Plant Sci ; 13: 787527, 2022.
Article in English | MEDLINE | ID: mdl-36237508

ABSTRACT

The article reviews and benchmarks machine learning methods for automatic image-based plant species recognition and proposes a novel retrieval-based method that classifies by nearest-neighbor search in a deep embedding space. The image retrieval method relies on a model trained via the Recall@k surrogate loss. State-of-the-art approaches to image classification, based on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), are benchmarked and compared with the proposed image retrieval-based method. The impact of performance-enhancing techniques, e.g., class prior adaptation, image augmentations, learning rate scheduling, and loss functions, is studied. The evaluation is carried out on the PlantCLEF 2017, ExpertLifeCLEF 2018, and iNaturalist 2018 datasets, the largest publicly available datasets for plant recognition. The evaluation of CNN and ViT classifiers shows a gradual improvement in classification accuracy. The current state-of-the-art Vision Transformer model, ViT-Large/16, achieves 91.15% and 83.54% accuracy on the PlantCLEF 2017 and ExpertLifeCLEF 2018 test sets, respectively, reducing the error rate of the best CNN model (ResNeSt-269e) by 22.91% and 28.34%. In addition, the performance-enhancing techniques increased the accuracy of ViT-Base/32 by 3.72% on ExpertLifeCLEF 2018 and by 4.67% on PlantCLEF 2017. The retrieval approach achieved superior performance in all measured scenarios, with accuracy margins of 0.28%, 4.13%, and 10.25% on ExpertLifeCLEF 2018, PlantCLEF 2017, and iNat2018-Plantae, respectively.
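A minimal sketch of the retrieval-style classification step: embed the query, retrieve the most similar training embeddings by cosine similarity, and vote over their species labels. The embedding model, k, and the similarity-weighted voting are assumptions made for illustration.

```python
import numpy as np

def knn_classify(query_emb, gallery_embs, gallery_labels, k=5):
    """Nearest-neighbor species prediction in a learned embedding space.

    query_emb      : (D,) embedding of the query image
    gallery_embs   : (N, D) embeddings of the labelled training images
    gallery_labels : (N,) integer species labels
    """
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                                   # cosine similarities
    top = np.argsort(-sims)[:k]                    # k most similar gallery images
    votes = np.bincount(gallery_labels[top], weights=sims[top])
    return votes.argmax()                          # similarity-weighted majority vote
```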

8.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 8420-8432, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34375281

ABSTRACT

A new method for robust estimation, MAGSAC++, is proposed. It introduces a new model quality (scoring) function that does not make inlier-outlier decisions, and a novel marginalization procedure formulated as an M-estimation with a novel class of M-estimators (a robust kernel) solved by an iteratively re-weighted least squares procedure. Instead of the inlier-outlier threshold, it requires only a loose upper bound on it, which can be chosen from a significantly wider range. Also, we propose a new termination criterion and a technique for selecting a set of inliers in a data-driven manner as a post-processing step after the robust estimation finishes. On a number of publicly available real-world datasets for homography, fundamental matrix fitting and relative pose estimation, MAGSAC++ produces results superior to the state-of-the-art robust methods. It is more geometrically accurate, fails fewer times, and is often faster. MAGSAC++ is shown to be significantly less sensitive to the setting of its threshold upper bound than other state-of-the-art algorithms are to their inlier-outlier thresholds. It is therefore easier to apply to unseen problems and scenes without manually tuning the inlier-outlier threshold. The source code and examples, both in C++ and Python, are available at https://github.com/danini/magsac.
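A generic iteratively re-weighted least squares sketch for robust 2D line fitting, with Cauchy-style weights standing in for the actual MAGSAC++ kernel, which is more involved; it only illustrates the IRLS mechanism mentioned above.

```python
import numpy as np

def irls_line_fit(points, sigma=1.0, iters=10):
    """Fit a line y = a*x + b by iteratively re-weighted least squares.

    points : (N, 2) array of (x, y) observations, possibly with outliers
    sigma  : scale of the robust weight function (Cauchy weights used here
             as a stand-in for the MAGSAC++ kernel)
    """
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([x, np.ones_like(x)])
    w = np.ones(len(x))
    for _ in range(iters):
        # weighted least-squares solve with the current weights
        W = np.sqrt(w)[:, None]
        params, *_ = np.linalg.lstsq(W * A, np.sqrt(w) * y, rcond=None)
        r = y - A @ params                    # residuals under the current model
        w = 1.0 / (1.0 + (r / sigma) ** 2)    # re-weight: large residuals count less
    return params  # (a, b)
```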

9.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9742-9755, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34941502

ABSTRACT

Template-based discriminative trackers are currently the dominant tracking paradigm due to their robustness, but they are restricted to bounding-box tracking and a limited range of transformation models, which reduces their localization accuracy. We propose a discriminative single-shot segmentation tracker, D3S2, which narrows the gap between visual object tracking and video object segmentation. A single-shot network applies two target models with complementary geometric properties, one invariant to a broad range of transformations, including non-rigid deformations, the other assuming a rigid object, to simultaneously achieve robust online target segmentation. The overall tracking reliability is further increased by decoupling the object and feature scale estimation. Without per-dataset fine-tuning, and trained only for segmentation as the primary output, D3S2 outperforms all published trackers on the recent short-term tracking benchmark VOT2020 and performs very close to the state-of-the-art trackers on GOT-10k, TrackingNet, OTB100 and LaSOT. D3S2 outperforms the leading segmentation tracker SiamMask on video object segmentation benchmarks and performs on par with top video object segmentation algorithms.

10.
IEEE Trans Cybern ; 51(12): 6305-6318, 2021 Dec.
Article in English | MEDLINE | ID: mdl-32248144

ABSTRACT

A long-term visual object tracking performance evaluation methodology and a benchmark are proposed. Performance measures are designed by following a long-term tracking definition to maximize the analysis probing strength. The new measures outperform existing ones in interpretation potential and in better distinguishing between different tracking behaviors. We show that these measures generalize the short-term performance measures, thus linking the two tracking problems. Furthermore, the new measures are highly robust to temporal annotation sparsity and allow annotation of sequences hundreds of times longer than in the current datasets without increasing manual annotation labor. A new challenging dataset of carefully selected sequences with many target disappearances is proposed. A new tracking taxonomy is proposed to position trackers on the short-term/long-term spectrum. The benchmark contains an extensive evaluation of the largest number of long-term trackers and a comparison to state-of-the-art short-term trackers. We analyze the influence of tracking architecture implementations on long-term performance and explore various redetection strategies as well as the influence of visual model update strategies on long-term tracking drift. The methodology is integrated into the VOT toolkit to automate experimental analysis and benchmarking and to facilitate the future development of long-term trackers.
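A hedged sketch of long-term-style tracking measures built from per-frame overlaps: precision over frames where the tracker reports the target, recall over frames where the target is visible, and their harmonic mean; the paper's exact definitions (e.g., confidence thresholding) may differ.

```python
import numpy as np

def long_term_f_score(overlaps, predicted, visible):
    """Tracking precision/recall/F-score in the long-term spirit.

    overlaps  : (T,) per-frame IoU between prediction and ground truth (0 if either absent)
    predicted : (T,) bool, frames where the tracker reported the target
    visible   : (T,) bool, frames where the target is actually visible
    """
    precision = overlaps[predicted].mean() if predicted.any() else 0.0
    recall = overlaps[visible].mean() if visible.any() else 0.0
    f = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f
```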


Subjects
Algorithms, Video Recording
11.
Article in English | MEDLINE | ID: mdl-32813657

ABSTRACT

If an object is photographed in motion in front of a static background, the object appears blurred while the background remains sharp and is partially occluded by the object. The goal is to recover the object's appearance from such a blurred image. We adopt the image formation model for fast-moving objects and consider objects undergoing 2D translation and rotation. For this scenario we formulate the estimation of the object shape, appearance, and motion from a single image and a known background as a constrained optimization problem with appropriate regularization terms. Both similarities and differences with blind deconvolution are discussed, with the differences caused mainly by the coupling of the object appearance and shape in the acquisition model. Necessary conditions for solution uniqueness are derived and a numerical solution based on the alternating direction method of multipliers is presented. The proposed method is evaluated on a new dataset.
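A minimal sketch of the assumed formation model, I = H*F + (1 - H*M)B: the blurred object appearance is composited over the background attenuated by the blurred object mask. The array shapes and the use of SciPy's FFT convolution are illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

def compose_blurred_frame(background, appearance, mask, blur_kernel):
    """I = H*F + (1 - H*M) * B : blurred object over a partially occluded background.

    background  : (H, W, 3) sharp static background B
    appearance  : (H, W, 3) sharp object appearance F (zero outside the object)
    mask        : (H, W) binary object mask M
    blur_kernel : (h, w) motion blur kernel H, summing to one
    """
    blurred_obj = np.stack([fftconvolve(appearance[..., c], blur_kernel, mode="same")
                            for c in range(3)], axis=-1)
    blurred_mask = fftconvolve(mask, blur_kernel, mode="same")
    return blurred_obj + (1.0 - blurred_mask)[..., None] * background
```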

12.
IEEE Trans Pattern Anal Mach Intell ; 42(11): 2825-2841, 2020 11.
Article in English | MEDLINE | ID: mdl-31094682

ABSTRACT

In this paper, a novel benchmark is introduced for evaluating local image descriptors. We demonstrate limitations of the commonly used datasets and evaluation protocols, which lead to ambiguities and contradictory results in the literature. Furthermore, these benchmarks are nearly saturated due to the recent improvements in local descriptors obtained by learning from large annotated datasets. To address these issues, we introduce a new large dataset suitable for training and testing modern descriptors, together with strictly defined evaluation protocols for several tasks such as matching, retrieval and verification. This allows for more realistic, and thus more reliable, comparisons in different application scenarios. We evaluate the performance of several state-of-the-art descriptors and analyse their properties. We show that a simple normalisation of traditional hand-crafted descriptors is able to boost their performance to the level of deep learning based descriptors once realistic benchmarks are considered. Additionally, we specify a protocol for learning and evaluating using cross-validation. We show that when training state-of-the-art descriptors on this dataset, the traditional verification task is almost entirely saturated.
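As an example of the kind of simple normalization that can boost hand-crafted descriptors, a RootSIFT-style transform (L1-normalize, element-wise square root, then L2-normalize); the paper's exact normalization may differ.

```python
import numpy as np

def root_normalize(desc, eps=1e-12):
    """RootSIFT-style normalization of a hand-crafted descriptor.

    desc : (D,) non-negative descriptor, e.g., a 128-D SIFT histogram
    """
    d = desc / (np.abs(desc).sum() + eps)   # L1 normalization
    d = np.sqrt(d)                          # Hellinger kernel mapping
    return d / (np.linalg.norm(d) + eps)    # final L2 normalization

# the dot product of normalized descriptors now approximates the
# Hellinger kernel on the original histograms
a = root_normalize(np.random.rand(128))
b = root_normalize(np.random.rand(128))
print(float(a @ b))
```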


Subjects
Deep Learning, Computer-Assisted Image Processing/methods, Algorithms
13.
IEEE Trans Pattern Anal Mach Intell ; 31(4): 677-92, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19229083

ABSTRACT

We propose a learning approach to tracking that explicitly minimizes the computational complexity of the tracking process subject to a user-defined probability of failure (loss-of-lock) and precision. The tracker is formed by a Number of Sequences of Learned Linear Predictors (NoSLLiP). Robustness of NoSLLiP is achieved by modeling the object as a collection of local motion predictors--object motion is estimated from the local predictions by the outlier-tolerant RANSAC algorithm. Efficiency of the NoSLLiP tracker stems from (i) the simplicity of the local predictors and (ii) the fact that all design decisions--the number of local predictors used by the tracker, their computational complexity (i.e. the number of observations the prediction is based on), their locations, as well as the number of RANSAC iterations--are subject to the optimization (learning) process. All time-consuming operations are performed during the learning stage--tracking is reduced to only a few hundred integer multiplications per step. On a PC with a single K8 3200+ CPU, a predictor evaluation requires about 30 microseconds. The proposed approach is verified on publicly available sequences with approximately 12,000 frames with ground truth. Experiments demonstrate superiority in frame rate and robustness with respect to the SIFT detector, the Lucas-Kanade tracker and other trackers.
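A minimal sketch of a learned linear displacement predictor: a matrix estimated by least squares from synthetically displaced training samples maps intensities at a fixed support set to a 2D motion, so run-time prediction is a single matrix-vector product. Shapes and the toy data are illustrative, not the NoSLLiP training procedure.

```python
import numpy as np

def learn_linear_predictor(observations, displacements):
    """Learn a matrix P mapping intensity observations to 2D displacements.

    observations  : (N, K) intensities sampled at K support pixels for N
                    synthetically perturbed training examples
    displacements : (N, 2) displacements that generated those examples
    """
    P, *_ = np.linalg.lstsq(observations, displacements, rcond=None)
    return P  # (K, 2)

def predict_displacement(P, observation):
    """At run time, prediction is a single matrix-vector product."""
    return observation @ P   # (2,)

# toy usage: 200 training perturbations, 30 support pixels
obs = np.random.randn(200, 30)
disp = np.random.randn(200, 2)
P = learn_linear_predictor(obs, disp)
print(predict_displacement(P, np.random.randn(30)))
```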

14.
IEEE Trans Pattern Anal Mach Intell ; 30(8): 1472-82, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18566499

ABSTRACT

A randomized model verification strategy for RANSAC is presented. The proposed method finds, like RANSAC, a solution that is optimal with user-specified probability. The solution is found in time that is (i) close to the shortest possible and (ii) superior to any deterministic verification strategy. A provably fastest model verification strategy is designed for the (theoretical) situation when the contamination of data by outliers is known. In this case, the algorithm is the fastest possible (on average) of all randomized RANSAC algorithms guaranteeing a confidence in the solution. The derivation of the optimality property is based on Wald's theory of sequential decision making, in particular a modified sequential probability ratio test (SPRT). Next, the R-RANSAC with SPRT algorithm is introduced. The algorithm removes the requirement for a priori knowledge of the fraction of outliers and estimates the quantity online. We show experimentally that on standard test data the method has performance close to the theoretically optimal and is 2 to 10 times faster than standard RANSAC and is up to 4 times faster than previously published methods.
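A minimal sketch of SPRT-based model verification: points are checked one by one, a likelihood ratio is accumulated, and verification stops early once the ratio exceeds the decision threshold A; epsilon and delta are the assumed inlier probabilities under the good-model and bad-model hypotheses.

```python
def sprt_verify(residuals, threshold, epsilon, delta, A):
    """Sequentially test a model hypothesis with Wald's SPRT.

    residuals : iterable of point-to-model residuals, evaluated one by one
    threshold : inlier-outlier residual threshold
    epsilon   : assumed probability that a point is an inlier if the model is good
    delta     : assumed probability that a point is an inlier if the model is bad
    A         : decision threshold; the model is rejected once the ratio exceeds A
    Returns (accepted, number_of_points_evaluated).
    """
    lam, n = 1.0, 0
    for r in residuals:
        n += 1
        if abs(r) < threshold:                       # point consistent with the model
            lam *= delta / epsilon
        else:                                        # point inconsistent
            lam *= (1.0 - delta) / (1.0 - epsilon)
        if lam > A:                                  # strong evidence the model is bad
            return False, n                          # reject early, saving evaluations
    return True, n
```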


Subjects
Algorithms, Artificial Intelligence, Statistical Data Interpretation, Computer-Assisted Image Interpretation/methods, Automated Pattern Recognition/methods, Software, Cluster Analysis, Computer Simulation, Statistical Models
15.
Plant Methods ; 13: 115, 2017.
Article in English | MEDLINE | ID: mdl-29299049

ABSTRACT

BACKGROUND: Fine-grained recognition of plants from images is a challenging computer vision task, due to the diverse appearance and complex structure of plants, high intra-class variability and small inter-class differences. We review the state-of-the-art and discuss plant recognition tasks, from identification of plants from specific plant organs to general plant recognition "in the wild". RESULTS: We propose texture analysis and deep learning methods for different plant recognition tasks. The methods are evaluated and compared to the state-of-the-art. Texture analysis is only applied to images with unambiguous segmentation (bark and leaf recognition), whereas CNNs are only applied when sufficiently large datasets are available. The results provide insight into the complexity of different plant recognition tasks. The proposed methods outperform the state-of-the-art in leaf and bark classification and achieve very competitive results in plant recognition "in the wild". CONCLUSIONS: The results suggest that recognition of segmented leaves is practically a solved problem when high volumes of training data are available. The generality and higher capacity of state-of-the-art CNNs make them suitable for plant recognition "in the wild", where the views of plant organs or plants vary significantly and the difficulty is increased by occlusions and background clutter.

16.
IEEE Trans Pattern Anal Mach Intell ; 38(9): 1872-85, 2016 09.
Article in English | MEDLINE | ID: mdl-26540676

ABSTRACT

An end-to-end real-time text localization and recognition method is presented. Its real-time performance is achieved by posing the character detection and segmentation problem as an efficient sequential selection from the set of Extremal Regions (ERs). The ER detector is robust to blur, low contrast, and variations in illumination, color and texture. In the first stage, the probability of each ER being a character is estimated using features calculated by a novel algorithm in constant time, and only ERs with locally maximal probability are selected for the second stage, where the classification accuracy is improved using computationally more expensive features. A highly efficient clustering algorithm then groups ERs into text lines and an OCR classifier trained on synthetic fonts is exploited to label character regions. The most probable character sequence is selected in the last stage, when the context of each character is known. The method was evaluated on three public datasets. On the ICDAR 2013 dataset the method achieves state-of-the-art results in text localization; on the more challenging SVT dataset, the proposed method significantly outperforms the state-of-the-art methods and demonstrates that the proposed pipeline can incorporate additional prior knowledge about the detected text. The proposed method was exploited as the baseline in the ICDAR 2015 Robust Reading competition, where it compares favourably to the state of the art.

17.
IEEE Trans Image Process ; 25(1): 359-71, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26552087

ABSTRACT

Long-term tracking of an object, given only a single instance in an initial frame, remains an open problem. We propose a visual tracking algorithm, robust to many of the difficulties that often occur in real-world scenes. Correspondences of edge-based features are used to overcome the reliance on the texture of the tracked object and to improve invariance to lighting. Furthermore, we address long-term stability, enabling the tracker to recover from drift and to provide redetection following object disappearance or occlusion. The two-module principle is similar to the successful state-of-the-art long-term TLD tracker; however, our approach offers better performance in benchmarks and extends to cases of low-textured objects. This becomes obvious in cases of plain objects with no texture at all, where the edge-based approach proves the most beneficial. We perform several different experiments to validate the proposed method. First, results on short-term sequences show the performance of tracking challenging (low-textured and/or transparent) objects that represent failure cases for competing state-of-the-art approaches. Second, long sequences are tracked, including one of almost 30 000 frames, which, to the best of our knowledge, is the longest tracking sequence reported to date. This tests the redetection and drift-resistance properties of the tracker. Finally, we report the results of the proposed tracker on the VOT Challenge 2013 and 2014 data sets as well as on the VTB1.0 benchmark, and we show the relative performance of the tracker compared with its competitors. All the results are comparable with the state of the art on sequences with textured objects and superior on non-textured objects. The new annotated sequences are made publicly available.

18.
IEEE Trans Pattern Anal Mach Intell ; 38(11): 2137-2155, 2016 11.
Article in English | MEDLINE | ID: mdl-26766217

ABSTRACT

This paper addresses the problem of single-target tracker performance evaluation. We consider the performance measures, the dataset and the evaluation system to be the most important components of tracker evaluation and propose requirements for each of them. The requirements are the basis of a new evaluation methodology that aims at a simple and easily interpretable tracker comparison. The ranking-based methodology addresses tracker equivalence in terms of statistical significance and practical differences. A fully annotated dataset with per-frame annotation of several visual attributes is introduced. The diversity of its visual properties is maximized in a novel way by clustering a large number of videos according to their visual attributes. This makes it the most carefully constructed and annotated dataset to date. A multi-platform evaluation system allowing easy integration of third-party trackers is presented as well. The proposed evaluation methodology was tested on the VOT2014 challenge on the new dataset and 38 trackers, making it the largest benchmark to date. Most of the tested trackers are indeed state-of-the-art since they outperform the standard baselines, resulting in a highly challenging benchmark. An exhaustive analysis of the dataset from the perspective of tracking difficulty is carried out. To facilitate tracker comparison, a new performance visualization technique is proposed.

19.
IEEE Trans Pattern Anal Mach Intell ; 35(8): 2022-38, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23787350

ABSTRACT

A computational problem that arises frequently in computer vision is that of estimating the parameters of a model from data that have been contaminated by noise and outliers. More generally, any practical system that seeks to estimate quantities from noisy data measurements must have at its core some means of dealing with data contamination. The random sample consensus (RANSAC) algorithm is one of the most popular tools for robust estimation. Recent years have seen an explosion of activity in this area, leading to the development of a number of techniques that improve upon the efficiency and robustness of the basic RANSAC algorithm. In this paper, we present a comprehensive overview of recent research in RANSAC-based robust estimation by analyzing and comparing various approaches that have been explored over the years. We provide a common context for this analysis by introducing a new framework for robust estimation, which we call Universal RANSAC (USAC). USAC extends the simple hypothesize-and-verify structure of standard RANSAC to incorporate a number of important practical and computational considerations. In addition, we provide a general-purpose C++ software library that implements the USAC framework by leveraging state-of-the-art algorithms for the various modules. This implementation thus addresses many of the limitations of standard RANSAC within a single unified package. We benchmark the performance of the algorithm on a large collection of estimation problems. The implementation we provide can be used by researchers either as a stand-alone tool for robust estimation or as a benchmark for evaluating new techniques.
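For reference, the basic hypothesize-and-verify RANSAC loop with the standard adaptive stopping criterion, using 2D line fitting as a stand-in model; USAC wraps further stages (degeneracy tests, local optimization, etc.) around this skeleton.

```python
import numpy as np

def ransac_line(points, threshold=1.0, conf=0.99, max_iters=10000, rng=None):
    """Basic hypothesize-and-verify RANSAC for fitting a 2D line a*x + b*y + c = 0.

    points : (N, 2) array of 2D points, possibly contaminated by outliers
    """
    rng = rng or np.random.default_rng()
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    best_inliers, best_model = np.zeros(n, dtype=bool), None
    iters, i = max_iters, 0
    while i < iters:
        i += 1
        p1, p2 = pts[rng.choice(n, size=2, replace=False)]
        a, b = p2[1] - p1[1], p1[0] - p2[0]           # line normal from two points
        c = -(a * p1[0] + b * p1[1])
        norm = np.hypot(a, b)
        if norm == 0:                                 # degenerate sample, skip it
            continue
        dists = np.abs(pts @ np.array([a, b]) + c) / norm
        inliers = dists < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
            best_model = (a / norm, b / norm, c / norm)
            w = inliers.mean()                        # current inlier ratio estimate
            if w >= 1.0:                              # every point fits; stop
                break
            # iterations needed to draw an all-inlier sample with probability conf
            iters = min(max_iters,
                        int(np.ceil(np.log(1.0 - conf) / np.log(1.0 - w * w))))
    return best_model, best_inliers
```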

20.
IEEE Trans Pattern Anal Mach Intell ; 34(7): 1409-22, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22156098

ABSTRACT

This paper investigates long-term tracking of unknown objects in a video stream. The object is defined by its location and extent in a single frame. In every frame that follows, the task is to determine the object's location and extent or indicate that the object is not present. We propose a novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates the detector's errors and updates it to avoid these errors in the future. We study how to identify the detector's errors and learn from them. We develop a novel learning method (P-N learning) which estimates the errors by a pair of "experts": (1) P-expert estimates missed detections, and (2) N-expert estimates false alarms. The learning process is modeled as a discrete dynamical system and the conditions under which the learning guarantees improvement are found. We describe our real-time implementation of the TLD framework and the P-N learning. We carry out an extensive quantitative evaluation which shows a significant improvement over state-of-the-art approaches.
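A hedged sketch of one P-N learning update with hypothetical interfaces: the P-expert turns a validated tracker location missed by the detector into a positive example, the N-expert turns detections far from that location into negatives, and the detector is retrained. The IoU thresholds are illustrative.

```python
def pn_learning_step(detector, tracker_box, detections, iou, retrain):
    """One illustrative P-N learning iteration (interfaces are hypothetical).

    detector    : object holding lists detector.positives / detector.negatives
    tracker_box : box of the validated tracker trajectory in this frame (or None)
    detections  : list of boxes the detector fired on in this frame
    iou         : function(box_a, box_b) -> intersection-over-union
    retrain     : function(detector) -> None, retrains on the updated example sets
    """
    if tracker_box is None:
        return                                   # nothing validated; no update
    # P-expert: a validated location the detector missed becomes a positive example
    if all(iou(tracker_box, d) < 0.5 for d in detections):
        detector.positives.append(tracker_box)
    # N-expert: detections far from the single validated location are false alarms
    for d in detections:
        if iou(tracker_box, d) < 0.2:
            detector.negatives.append(d)
    retrain(detector)
```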
