1.
Ultrasonics ; 123: 106705, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35240462

ABSTRACT

The inspection of sizeable plate-based metal structures such as storage tanks or marine vessel hulls is a major industrial challenge that calls for reliable and time-efficient solutions. Although Lamb waves have been identified as a promising tool for long-range non-destructive testing, and despite substantial progress in autonomous navigation and environment sensing, a Lamb-wave-based robotic system for monitoring extensive structures is still lacking. Following previous work on ultrasonic Simultaneous Localization and Mapping (SLAM), we introduce a method that infers the plate geometry without prior knowledge of the material propagation properties, which may be unavailable during a practical inspection task in challenging outdoor environments. Our approach combines focalization, to adjust the propagation model parameters, with beamforming, to infer the location of the plate boundaries directly from acoustic measurements acquired along the mobile unit's trajectory. For each candidate model, the focusing ability of the corresponding beamformer is assessed on high-pass filtered beamforming maps to further improve the robustness of the plate geometry estimates. We then recover the optimal space-domain beamformer through a simulated annealing optimization process. We evaluate our method on three sets of experimental data acquired under different conditions and show that accurate plate geometry inference can be achieved without any prior propagation model. Finally, the results show that the optimal beamformer outperforms the beamformer derived from a predetermined propagation model under non-nominal acquisition conditions.
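To make the kind of processing described above more concrete, here is a minimal Python sketch, not the authors' implementation: it assumes a single isotropic Lamb mode with an unknown group velocity, builds a delay-and-sum beamforming map from pulse-echo signals recorded along the trajectory, scores how sharply a crudely high-pass filtered map focuses, and anneals the velocity to maximize that score. All function names, the focusing criterion, and the annealing schedule are illustrative assumptions.

```python
import numpy as np

def beamform_map(signals, positions, times, velocity, grid_x, grid_y):
    """Delay-and-sum beamforming of pulse-echo signals onto a spatial grid.

    signals:   (n_meas, n_samples) echoes recorded along the trajectory
    positions: (n_meas, 2) sensor position for each measurement
    times:     (n_samples,) common time axis
    velocity:  assumed group velocity of the Lamb mode (the model parameter)
    """
    bmap = np.zeros((grid_x.size, grid_y.size))
    for sig, pos in zip(signals, positions):
        # Round-trip delay from the sensor to every candidate reflector point.
        dx = grid_x[:, None] - pos[0]
        dy = grid_y[None, :] - pos[1]
        delay = 2.0 * np.sqrt(dx**2 + dy**2) / velocity
        # Sample each echo at the corresponding round-trip delay and stack.
        bmap += np.interp(delay, times, sig, left=0.0, right=0.0)
    return bmap

def focusing_score(bmap):
    """Sharpness of the map: large when boundary echoes focus well."""
    hp = bmap - np.mean(bmap)  # crude high-pass: remove the slowly varying background
    return np.max(np.abs(hp)) / (np.mean(np.abs(hp)) + 1e-12)

def anneal_velocity(signals, positions, times, grid_x, grid_y,
                    v0=3000.0, n_iter=200, step=100.0, t0=1.0, seed=0):
    """Simulated annealing over the unknown propagation velocity."""
    rng = np.random.default_rng(seed)
    v = best_v = v0
    score = best_score = focusing_score(
        beamform_map(signals, positions, times, v, grid_x, grid_y))
    for k in range(n_iter):
        temp = t0 * (1.0 - k / n_iter) + 1e-3
        cand = v + rng.normal(scale=step)
        cand_score = focusing_score(
            beamform_map(signals, positions, times, cand, grid_x, grid_y))
        # Always accept improvements; accept worse candidates with a
        # temperature-dependent probability.
        if cand_score > score or rng.random() < np.exp((cand_score - score) / temp):
            v, score = cand, cand_score
        if score > best_score:
            best_v, best_score = v, score
    return best_v
```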

2.
IEEE Trans Neural Netw Learn Syst ; 30(6): 1831-1840, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30387743

ABSTRACT

Policy evaluation algorithms are essential to reinforcement learning because they predict the performance of a policy. This prediction problem, however, involves two long-standing issues: off-policy stability and on-policy efficiency. The conventional temporal difference (TD) algorithm is known to perform very well in the on-policy setting, yet it is not off-policy stable. The gradient TD and emphatic TD algorithms, on the other hand, are off-policy stable but not on-policy efficient. This paper introduces novel algorithms that are both off-policy stable and on-policy efficient by using the oblique projection method. Empirical results on various domains validate the effectiveness of the proposed approach.
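For context, the sketch below shows the conventional baseline the abstract contrasts against: semi-gradient linear TD(0) policy evaluation. It is a minimal illustration with assumed names and data layout, not the oblique-projection algorithms introduced in the paper.

```python
import numpy as np

def td0_policy_evaluation(transitions, phi, n_features, alpha=0.01, gamma=0.99):
    """Linear TD(0) policy evaluation: V(s) ~ w . phi(s).

    transitions: iterable of (state, reward, next_state, done) tuples collected
                 while following the policy being evaluated (on-policy data).
    phi:         feature map, state -> np.ndarray of length n_features.
    """
    w = np.zeros(n_features)
    for s, r, s_next, done in transitions:
        v = w @ phi(s)
        v_next = 0.0 if done else w @ phi(s_next)
        td_error = r + gamma * v_next - v      # one-step TD error
        w += alpha * td_error * phi(s)         # semi-gradient update
    return w
```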

3.
IEEE Trans Neural Netw Learn Syst ; 28(8): 1814-1826, 2017 Aug.
Article in English | MEDLINE | ID: mdl-27164607

ABSTRACT

Learning from demonstrations is a paradigm in which an apprentice agent learns a control policy for a dynamic environment by observing demonstrations delivered by an expert agent. In the literature it is usually implemented as either imitation learning (IL) or inverse reinforcement learning (IRL). On the one hand, IRL is a paradigm relying on Markov decision processes, in which the goal of the apprentice agent is to find, from the expert demonstrations, a reward function that explains the expert's behavior. On the other hand, IL consists of directly generalizing the expert strategy observed in the demonstrations to unvisited states (and is therefore close to classification when there is a finite set of possible decisions). While these two views are often considered opposites, the purpose of this paper is to exhibit a formal link between them from which new algorithms can be derived. We show that IL and IRL can be redefined so that they are equivalent, in the sense that there exists an explicit bijective operator (namely, the inverse optimal Bellman operator) between their respective spaces of solutions. To do so, we introduce the set-policy framework, which creates a clear link between IL and IRL. As a result, IL and IRL solutions making the best of both worlds are obtained. In addition, it is a unifying framework from which existing IL and IRL algorithms can be derived and which opens the way for IL methods able to deal with the environment's dynamics. Finally, the IRL algorithms derived from the set-policy framework are compared with algorithms belonging to the more common trajectory-matching family. Experiments demonstrate that the set-policy-based algorithms outperform both the standard IRL and IL algorithms and yield more robust solutions.
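As a minimal illustration of the IL-as-classification view mentioned above, the sketch below fits a classifier to expert state-action pairs (plain behavioral cloning), assuming a finite action set and using scikit-learn's LogisticRegression as a stand-in classifier. It is not the set-policy algorithms proposed in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def behavioral_cloning(expert_states, expert_actions):
    """Fit a classifier mapping states to the expert's discrete actions.

    expert_states:  (n_demos, state_dim) states visited in the demonstrations.
    expert_actions: (n_demos,) discrete actions chosen by the expert.
    """
    policy = LogisticRegression(max_iter=1000)
    policy.fit(expert_states, expert_actions)
    return policy

# The learned policy generalizes the expert's decisions to unvisited states:
#   action = policy.predict(new_state.reshape(1, -1))[0]
```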

4.
IEEE Trans Neural Netw Learn Syst ; 24(6): 845-867, 2013 Jun.
Article in English | MEDLINE | ID: mdl-24808468

ABSTRACT

Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of RL concerns computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally a stochastic gradient descent or a recursive least-squares approach.
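As an illustrative example of one member of the projected fixed-point family discussed in the survey, the sketch below implements batch Least-Squares Temporal Difference (LSTD) learning with linear features; the data layout and the feature-map interface are assumptions, not the survey's notation.

```python
import numpy as np

def lstd(transitions, phi, n_features, gamma=0.99, ridge=1e-6):
    """Batch LSTD: projected fixed-point solution for V(s) ~ w . phi(s).

    transitions: list of (state, reward, next_state, done) tuples.
    phi:         feature map, state -> np.ndarray of length n_features.
    """
    A = ridge * np.eye(n_features)   # regularized accumulator of phi (phi - gamma phi')^T
    b = np.zeros(n_features)
    for s, r, s_next, done in transitions:
        f = phi(s)
        f_next = np.zeros(n_features) if done else phi(s_next)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)     # weights of the approximate value function
```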
