Results 1 - 20 of 53
1.
Sensors (Basel) ; 24(12), 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38931716

ABSTRACT

To address the poor robustness and generality of traditional contour matching algorithms in engineering applications, this paper details a method for improving the surface defect detection of industrial products based on contour matching. Based on the image pyramid optimization method, a three-level matching method is designed, which quickly obtains the candidate pose of the target contour at the top of the image pyramid by combining the integral graph and the integral graph acceleration strategy based on weak classification. It quickly obtains the rough position and rough angle of the target contour, which greatly improves the performance of the algorithm. In addition, because expanding the target candidate points generates a large number of duplicate candidate points, a method is designed to select the optimal candidate point in the neighborhood of each target candidate point, which guarantees the matching accuracy and greatly reduces the amount of computation. To verify the effectiveness of the algorithm, functional test experiments were designed for the template building function and the contour matching function, covering uniform illumination, nonlinear illumination, and contour matching detection under different conditions. The results show that: (1) Under uniform illumination conditions, the detection accuracy can be maintained at about 93%. (2) Under nonlinear illumination conditions, the detection accuracy can be maintained at about 91.84%. (3) When there is an external interference source, false detections or missed detections can occur, but the overall defect detection rate remains above 94%. It is verified that the proposed method can meet the application requirements of common defect detection, has good robustness, and meets the expected functional requirements of the algorithm, providing a strong technical guarantee and data support for the later design of embedded image sensors.
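The abstract above does not spell out its "integral graph" acceleration. Assuming the term refers to the standard integral image (summed-area table), the following minimal sketch shows why such a table speeds up matching: once built, the sum over any rectangular window costs four lookups. The function names and the toy image are illustrative and do not come from the paper.

```python
def integral_image(img):
    """Build a summed-area table padded with one leading row and column of zeros."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of the current row.
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of the pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Toy 3x3 image (illustrative values only).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
total = box_sum(table, 0, 0, 3, 3)          # whole image
lower_right = box_sum(table, 1, 1, 3, 3)    # the 2x2 block 5, 6, 8, 9
```

Building the table is linear in the number of pixels, after which every window sum is constant time regardless of window size.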

2.
Neural Comput ; 35(5): 958-976, 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-36944244

ABSTRACT

Visual navigation involves a movable robotic agent striving to reach a point goal (target location) using vision sensory input. While navigation with ideal visibility has seen plenty of success, it becomes challenging in suboptimal visual conditions like poor illumination, where traditional approaches suffer from severe performance degradation. We propose E3VN (echo-enhanced embodied visual navigation) to effectively perceive the surroundings even under poor visibility to mitigate this problem. This is made possible by adopting an echoer that actively perceives the environment via auditory signals. E3VN models the robot agent as playing a cooperative Markov game with that echoer. The action policies of robot and echoer are jointly optimized to maximize the reward in a two-stream actor-critic architecture. During optimization, the reward is also adaptively decomposed into the robot and echoer parts. Our experiments and ablation studies show that E3VN is consistently effective and robust in point goal navigation tasks, especially under nonideal visibility.

3.
Entropy (Basel) ; 25(4), 2023 Apr 14.
Article in English | MEDLINE | ID: mdl-37190445

ABSTRACT

Autonomous indoor service robots are affected by multiple factors when they are directly involved in manipulation tasks in daily life, such as scenes, objects, and actions. It is of self-evident importance to properly parse these factors and interpret intentions according to human cognition and semantics. In this study, the design of a semantic representation framework based on a knowledge graph is presented, including (1) a multi-layer knowledge-representation model, (2) a multi-module knowledge-representation system, and (3) a method to extract manipulation knowledge from multiple sources of information. Moreover, with the aim of generating semantic representations of entities and relations in the knowledge base, a knowledge-graph-embedding method based on graph convolutional neural networks is proposed in order to provide high-precision predictions of factors in manipulation tasks. Through the prediction of action sequences via this embedding method, robots in real-world environments can be effectively guided by the knowledge framework to complete task planning and object-oriented transfer.

4.
IEEE Trans Cybern ; 54(5): 2784-2797, 2024 May.
Article in English | MEDLINE | ID: mdl-37713227

ABSTRACT

Robotic rigid contact-rich manipulation in an unstructured dynamic environment requires an effective resolution for smart manufacturing. As the most common use case for the intelligence industry, a lot of studies based on reinforcement learning (RL) algorithms have been conducted to improve the performances of single peg-in-hole assembly. However, existing RL methods are difficult to apply to multiple peg-in-hole issues due to more complicated geometric and physical constraints. In addition, previously limited solutions for multiple peg-in-hole assembly are hard to transfer into real industrial scenarios flexibly. To effectively address these issues, this work designs a novel and more challenging multiple peg-in-hole assembly setup by using the advantage of the Industrial Metaverse. We propose a detailed solution scheme to solve this task. Specifically, multiple modalities, including vision, proprioception, and force/torque, are learned as compact representations to account for the complexity and uncertainties and improve the sample efficiency. Furthermore, RL is used in the simulation to train the policy, and the learned policy is transferred to the real world without extra exploration. Domain randomization and impedance control are embedded into the policy to narrow the gap between simulation and reality. Evaluation results demonstrate the effectiveness of the proposed solution, showcasing successful multiple peg-in-hole assembly and generalization across different object shapes in real-world scenarios.
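The abstract above mentions domain randomization for narrowing the simulation-to-reality gap without saying which quantities are randomized. As a rough sketch of the general idea only, the snippet below redraws a few simulator parameters at the start of each training episode. The parameter names and ranges are hypothetical placeholders, not values from the paper.

```python
import random

# Hypothetical parameter ranges; the paper does not list the ones it uses.
PARAM_RANGES = {
    "friction": (0.5, 1.2),
    "peg_mass_kg": (0.05, 0.20),
    "hole_clearance_mm": (0.1, 0.5),
}


def sample_episode_params(ranges, rng):
    """Draw one value per parameter, uniformly within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = random.Random(42)  # fixed seed so the sketch is reproducible
episode_params = sample_episode_params(PARAM_RANGES, rng)
```

A policy trained across many such draws has to work over the whole range, which is the intuition behind transferring it to hardware without extra exploration.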

5.
Neural Netw ; 172: 106075, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38278092

ABSTRACT

The SSVEP-based paradigm serves as a prevalent approach in the realm of brain-computer interface (BCI). However, the processing of multi-channel electroencephalogram (EEG) data introduces challenges due to its non-Euclidean characteristic, necessitating methodologies that account for inter-channel topological relations. In this paper, we introduce the Dynamic Decomposition Graph Convolutional Neural Network (DDGCNN) designed for the classification of SSVEP EEG signals. Our approach incorporates layerwise dynamic graphs to address the oversmoothing issue in Graph Convolutional Networks (GCNs), employing a dense connection mechanism to mitigate the gradient vanishing problem. Furthermore, we enhance the traditional linear transformation inherent in GCNs with graph dynamic fusion, thereby elevating feature extraction and adaptive aggregation capabilities. Our experimental results demonstrate the effectiveness of the proposed approach in learning and extracting features from EEG topological structure. The results show that DDGCNN outperforms other state-of-the-art (SOTA) algorithms reported on two datasets (Dataset 1: 54 subjects, 4 targets, 2 sessions; Dataset 2: 35 subjects, 40 targets). Additionally, we showcase the implementation of DDGCNN in the context of synchronized BCI robotic fish control. This work represents a significant advancement in the field of EEG signal processing for SSVEP-based BCIs. Our proposed method processes SSVEP time domain signals directly as an end-to-end system, making it easy to deploy. The code is available at https://github.com/zshubin/DDGCNN.


Subject(s)
Brain-Computer Interfaces , Humans , Evoked Potentials, Visual , Neural Networks, Computer , Algorithms , Electroencephalography/methods , Photic Stimulation
6.
Article in English | MEDLINE | ID: mdl-38190682

ABSTRACT

The label transition matrix has emerged as a widely accepted method for mitigating label noise in machine learning. In recent years, numerous studies have centered on leveraging deep neural networks to estimate the label transition matrix for individual instances within the context of instance-dependent noise. However, these methods suffer from low search efficiency due to the large space of feasible solutions. We find that the real culprit behind this drawback is invalid class transitions, that is, cases where the actual transition probability between certain classes is zero but is estimated to have a nonzero value. To mask the invalid class transitions, we introduce a human-cognition-assisted method that uses structural information from human cognition. Specifically, we introduce a structured transition matrix network (STMN) designed with an adversarial learning process to balance instance features and prior information from human cognition. The proposed method offers two advantages: 1) better estimation effectiveness is obtained by sparsifying the transition matrix and 2) better estimation accuracy is obtained with the assistance of human cognition. By exploiting these two advantages, our method parametrically estimates a sparse label transition matrix, effectively converting noisy labels into true labels. The efficiency and superiority of our proposed method are substantiated through comprehensive comparisons with state-of-the-art methods on three synthetic datasets and a real-world dataset. Our code will be available at https://github.com/WheatCao/STMN-Pytorch.
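To make the idea of masking invalid class transitions concrete, here is a minimal sketch, not the STMN method itself. It assumes a hand-written boolean mask stands in for the structural prior from human cognition, zeroes the masked entries of an estimated transition matrix, and renormalizes each row. The class names, matrix values, and mask are invented for illustration, and the mask is assumed to always keep the diagonal so that no row becomes empty.

```python
def mask_and_normalize(transition, valid_mask):
    """Zero out transitions marked invalid, then rescale each row to sum to 1."""
    masked = []
    for row, mask_row in zip(transition, valid_mask):
        kept = [p if ok else 0.0 for p, ok in zip(row, mask_row)]
        row_total = sum(kept)  # nonzero as long as the diagonal is kept
        masked.append([p / row_total for p in kept])
    return masked


def noisy_posterior(clean_posterior, transition):
    """P(noisy = j) = sum over i of P(clean = i) * T[i][j]."""
    n_classes = len(transition[0])
    return [
        sum(clean_posterior[i] * transition[i][j] for i in range(len(transition)))
        for j in range(n_classes)
    ]


# Illustrative classes: 0 = cat, 1 = dog, 2 = truck.
estimated_t = [[0.80, 0.15, 0.05],
               [0.20, 0.70, 0.10],
               [0.05, 0.05, 0.90]]
# Assumed prior: cats and dogs may be confused with each other, trucks with neither.
valid = [[True, True, False],
         [True, True, False],
         [False, False, True]]

sparse_t = mask_and_normalize(estimated_t, valid)
posterior = noisy_posterior([1.0, 0.0, 0.0], sparse_t)
```

After masking, the matrix has fewer free entries to estimate, which is the sparsity the abstract credits for the improved search efficiency.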

7.
Article in English | MEDLINE | ID: mdl-38300770

ABSTRACT

Hierarchical reinforcement learning (HRL) exhibits remarkable potential in addressing large-scale and long-horizon complex tasks. However, a fundamental challenge, which arises from the inherently entangled nature of hierarchical policies, is not well understood, consequently compromising the training stability and exploration efficiency of HRL. In this article, we propose a novel HRL algorithm, high-level model approximation (HLMA), presenting both theoretical foundations and practical implementations. In HLMA, a Planner constructs an innovative high-level dynamic model to predict the k-step transition of the Controller in a subtask. This allows for the estimation of the evolving performance of the Controller. At the low level, we leverage the initial state of each subtask, transforming absolute states into relative deviations by a designed operator as Controller input. This approach facilitates the reuse of subtask domain knowledge, enhancing data efficiency. With this designed structure, we establish the local convergence of each component within HLMA and subsequently derive regret bounds to ensure global convergence. Extensive experiments conducted on complex locomotion and navigation tasks demonstrate that HLMA surpasses other state-of-the-art single-level RL and HRL algorithms in terms of sample efficiency and asymptotic performance. In addition, thorough ablation studies validate the effectiveness of each component of HLMA.

8.
Adv Mater ; : e2403830, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38848548

ABSTRACT

Flexoelectricity features the strain gradient-induced mechanoelectric conversion using materials not limited by their crystalline symmetry, but state-of-the-art flexoelectric materials exhibit very small flexoelectric coefficients and are too brittle to withstand large deformations. Here, inspired by the ion polarization in living organisms, this paper reports the giant iontronic flexoelectricity of soft hydrogels where the ion polarization is attributed to the different transfer rates of cations and anions under bending deformations. The flexoelectricity is found to be easily regulated by the types of anion-cation pairs and polymer networks in the hydrogel. A polyacrylamide hydrogel with 1 m NaCl achieves a record-high flexoelectric coefficient of ≈1160 µC m⁻¹, which can even be improved to ≈2340 µC m⁻¹ by synergizing with the effects of ion pairs and extra polycation chains. Furthermore, the hydrogel as flexoelectric materials can withstand larger bending deformations to obtain higher polarization charges owing to its intrinsic low modulus and high elasticity. A soft flexoelectric sensor is then demonstrated for object recognition by robotic hands. The findings greatly broaden the flexoelectricity to soft, biomimetic, and biocompatible materials and applications.

9.
Soft Robot ; 11(3): 508-518, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38386776

ABSTRACT

Teleoperation in soft robotics can endow soft robots with the ability to perform complex tasks through human-robot interaction. In this study, we propose a teleoperated anthropomorphic soft robot hand with variable degrees of freedom (DOFs) and a metamorphic palm. The soft robot hand consists of four pneumatic-actuated fingers, which can be heated to tune stiffness. A metamorphic mechanism was actuated to morph the hand palm by servo motors. The human fingers' DOF, gesture, and muscle stiffness were collected and mapped to the soft robotic hand through the sensory feedback from surface electromyography devices on the jib. The results show that the proposed soft robot hand can generate a variety of anthropomorphic configurations and can be remotely controlled to perform complex tasks such as primitively operating the cell phone and placing the building blocks. We also show that the soft hand can grasp a target through the slit by varying the DOFs and stiffness in a trial.


Subject(s)
Fingers , Hand , Robotics , Robotics/instrumentation , Humans , Fingers/physiology , Hand/physiology , Equipment Design , Hand Strength/physiology , Electromyography
10.
Article in English | MEDLINE | ID: mdl-38941209

ABSTRACT

Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. Early works in this domain mainly focus on static KGR, and recent works try to leverage temporal and multi-modal information, which is more practical and closer to the real world. However, no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a first survey of knowledge graph reasoning, tracing from static to temporal and then to multi-modal KGs. Concretely, the models are reviewed based on a bi-level taxonomy, i.e., top-level (graph types) and base-level (techniques and scenarios). Besides, the performances, as well as datasets, are summarized and presented. Moreover, we point out the challenges and potential opportunities to enlighten the readers. The corresponding open-source repository is shared on GitHub https://github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.

11.
IEEE Trans Cybern ; PP, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38578861

ABSTRACT

The utilization of robots in computer, communication, and consumer electronics (3C) assembly has the potential to significantly reduce labor costs and enhance assembly efficiency. However, many typical scenarios in 3C assembly, such as the assembly of flexible printed circuits (FPCs), involve complex manipulations with long-horizon steps and high-precision requirements that cannot be effectively accomplished through manual programming or conventional skill-learning methods. To address this challenge, this article proposes a learning-based framework for the acquisition of complex 3C assembly skills assisted by a multimodal digital-twin environment. First, we construct a fully equivalent digital-twin environment based on the real-world counterpart, equipped with visual, tactile force, and proprioception information, and then collect multimodal demonstration data using virtual reality (VR) devices. Next, we construct a skill knowledge base through multimodal skill parsing of demonstration data, resulting in primitive policy sequences for achieving 3C assembly tasks. Finally, we train primitive policies via a combination of curriculum learning, residual reinforcement learning, and domain randomization methods and transfer the learned skill from the digital-twin environment to the real-world environment. The experiments are conducted to verify the effectiveness of our proposed method.

12.
IEEE Trans Neural Netw Learn Syst ; 34(9): 5926-5936, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34932488

ABSTRACT

This article studies the robust intelligent control for the longitudinal dynamics of flexible hypersonic flight vehicle with input dead zone. Considering the different time-scale characteristics among the system states, the singular perturbation decomposition is employed to transform the rigid-elastic coupling model into the slow dynamics and the fast dynamics. For the slow dynamics with unknown system nonlinearities, the robust neural control is constructed using the switching mechanism to achieve the coordination between robust design and neural learning. For the time-varying control gain caused by unknown dead-zone input, the stable control is presented with an adaptive estimation design. For the fast dynamics, the sliding mode control is constructed to make the elastic modes stable and convergent. The elevator deflection is obtained by combining the two control signals. The stability of the dynamics is analyzed through the Lyapunov approach and the system tracking errors are bounded. The simulation is conducted to demonstrate the effectiveness of the proposed approach.
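For readers unfamiliar with the input dead zone mentioned above, the following sketch implements the commonly used piecewise-linear dead-zone model: the actuator produces no output while the command stays between two breakpoints and responds linearly outside them. The abstract does not state which model the article adopts, so the breakpoints and slopes here are illustrative assumptions.

```python
def dead_zone(command, b_left, b_right, m_left=1.0, m_right=1.0):
    """Piecewise-linear dead zone with breakpoints b_left < 0 < b_right."""
    if command >= b_right:
        return m_right * (command - b_right)
    if command <= b_left:
        return m_left * (command - b_left)
    return 0.0  # command lies inside the dead band: no actuator response


# Illustrative breakpoints of -0.1 and 0.1 with unit slopes.
inside = dead_zone(0.05, -0.1, 0.1)
above = dead_zone(0.5, -0.1, 0.1)
below = dead_zone(-0.3, -0.1, 0.1, m_left=2.0)
```

When the breakpoints and slopes are unknown, the effective control gain seen by the controller varies with the command, which is why the abstract pairs the dead zone with an adaptive estimation design.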

13.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7567-7577, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35157591

ABSTRACT

This article investigates the robust adaptive learning control for space robots with target capturing. Based on the momentum conservation theory, the impact dynamics is constructed to derive the relationship of generalized velocity in the pre-impact and post-impact phase. Considering the nonlinear dynamics with contact impact, the robust control using nonsingular terminal sliding mode (NTSM) and fast NTSM is designed to achieve the fast realization of the desired states. Furthermore, for the unknown dynamics of the combination system after capturing a target, the adaptive learning control is developed based on neural network and disturbance observer. Through the serial-parallel estimation model, the prediction error is constructed for the update of adaptive law. The system signals involved in the Lyapunov function are proved to be bounded and the sliding mode surface converges in finite time. Simulation studies present the desired tracking and learning performance.

14.
IEEE Trans Neural Netw Learn Syst ; 34(9): 5452-5463, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35767493

ABSTRACT

Multifingered hand dexterous manipulation is quite challenging in the domain of robotics. One remaining issue is how to achieve compliant behaviors. In this work, we propose a human-in-the-loop learning-control approach for acquiring compliant grasping and manipulation skills of a multifinger robot hand. This approach takes the depth image of the human hand as input and generates the desired force commands for the robot. The markerless vision-based teleoperation system is used for the task demonstration, and an end-to-end neural network model (i.e., TeachNet) is trained to map the pose of the human hand to the joint angles of the robot hand in real-time. To endow the robot hand with compliant human-like behaviors, an adaptive force control strategy is designed to predict the desired force control commands based on the pose difference between the robot hand and the human hand during the demonstration. The force controller is derived from a computational model of the biomimetic control strategy in human motor learning, which allows adapting the control variables (impedance and feedforward force) online during the execution of the reference joint angles. The simultaneous adaptation of the impedance and feedforward profiles enables the robot to interact with the environment compliantly. Our approach has been verified in both simulation and real-world task scenarios based on a multifingered robot hand, that is, the Shadow Hand, and has shown more reliable performances than the current widely used position control mode for obtaining compliant grasping and manipulation behaviors.

15.
IEEE Trans Pattern Anal Mach Intell ; 45(5): 5481-5496, 2023 May.
Article in English | MEDLINE | ID: mdl-36178992

ABSTRACT

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite the fruitful progress, existing methods for both problems are still brittle to the same challenge: it remains difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Besides, although they are closely related to each other, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and more importantly, applicable to multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance that is measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For the application of dense image prediction, the validity of CEN is tested in four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies have also been carried out, which demonstrate the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
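The channel-exchanging rule described above can be illustrated for two modalities with a small sketch: a channel whose BN scaling factor has a magnitude below a threshold is treated as unimportant and replaced by the corresponding channel from the other modality. This is a simplified reading of the abstract, not the released CEN code. The threshold value and the toy feature values are assumptions, and any further constraints the authors place on which channels may be exchanged are not modeled.

```python
def exchange_channels(feat_a, feat_b, gamma_a, gamma_b, threshold=0.02):
    """Swap in the other modality's channel wherever |BN scale| < threshold."""
    out_a, out_b = [], []
    for ch_a, ch_b, g_a, g_b in zip(feat_a, feat_b, gamma_a, gamma_b):
        out_a.append(ch_b if abs(g_a) < threshold else ch_a)
        out_b.append(ch_a if abs(g_b) < threshold else ch_b)
    return out_a, out_b


# Three channels per modality; each channel is a tiny flattened feature map.
rgb_feat = [[1.0], [2.0], [3.0]]
depth_feat = [[10.0], [20.0], [30.0]]
rgb_gamma = [0.5, 0.001, 0.3]    # channel 1 is near zero, so it gets replaced
depth_gamma = [0.0, 0.4, 0.2]    # channel 0 is near zero, so it gets replaced

new_rgb, new_depth = exchange_channels(rgb_feat, depth_feat, rgb_gamma, depth_gamma)
```

Because the rule only reads existing BN scaling factors, it adds no learnable parameters, which matches the abstract's description of the exchange as parameter-free.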

16.
Biomimetics (Basel) ; 8(3), 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37504216

ABSTRACT

Myoelectric control for prosthetic hands is an important topic in the field of rehabilitation. Intuitive and intelligent myoelectric control can help amputees to regain upper limb function. However, current research efforts are primarily focused on developing rich myoelectric classifiers and biomimetic control methods, limiting prosthetic hand manipulation to simple grasping and releasing tasks, while rarely exploring complex daily tasks. In this article, we conduct a systematic review of recent achievements in two areas, namely, intention recognition research and control strategy research. Specifically, we focus on advanced methods for motion intention types, discrete motion classification, continuous motion estimation, unidirectional control, feedback control, and shared control. In addition, based on the above review, we analyze the challenges and opportunities for research directions of functionality-augmented prosthetic hands and user burden reduction, which can help overcome the limitations of current myoelectric control research and provide development prospects for future research.

17.
Article in English | MEDLINE | ID: mdl-37494169

ABSTRACT

It has been discovered that graph convolutional networks (GCNs) encounter a remarkable drop in performance when multiple layers are piled up. The main reason deep GCNs fail is oversmoothing, which isolates the network output from the input as network depth increases, weakening expressivity and trainability. In this article, we start by investigating refined measures upon DropEdge, an existing simple yet effective technique to relieve oversmoothing. We term our method DropEdge++ for its two structure-aware samplers in contrast to DropEdge: a layer-dependent (LD) sampler and a feature-dependent (FD) sampler. Regarding the LD sampler, we interestingly find that increasingly sampling edges from the bottom layer yields better performance than the decreasing counterpart as well as DropEdge. We theoretically explain this phenomenon with mean-edge-number (MEN), a metric closely related to oversmoothing. For the FD sampler, we associate the edge sampling probability with the feature similarity of node pairs and prove that it further correlates the convergence subspace of the output layer with the input features. Extensive experiments on several node classification benchmarks, including both full- and semi-supervised tasks, illustrate the efficacy of DropEdge++ and its compatibility with a variety of backbones by achieving generally better performance over DropEdge and the no-drop version.
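As a rough illustration of layer-dependent edge sampling, the sketch below draws a random subset of edges for each layer, with the kept fraction growing from the bottom layer upward. It only conveys the general idea stated in the abstract. The actual DropEdge++ samplers may differ (for example in how the subsets of different layers relate to one another), and the keep ratios and toy graph are assumptions.

```python
import random


def sample_edges(edges, keep_ratio, rng):
    """Keep a uniformly random subset containing about keep_ratio of the edges."""
    n_keep = max(1, round(len(edges) * keep_ratio))
    return rng.sample(edges, n_keep)


def layerwise_edge_sets(edges, keep_ratios, seed=0):
    """Return one sampled edge list per layer, bottom layer first."""
    rng = random.Random(seed)
    return [sample_edges(edges, ratio, rng) for ratio in keep_ratios]


# Toy graph with 8 undirected edges given as node pairs.
graph_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (4, 5)]
# Increasing keep ratios: the bottom layer sees the fewest edges.
per_layer = layerwise_edge_sets(graph_edges, keep_ratios=[0.25, 0.5, 0.75])
edge_counts = [len(layer_edges) for layer_edges in per_layer]
```

Each layer would then run its graph convolution on its own sampled edge list instead of the full adjacency.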

18.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8861-8873, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37021866

ABSTRACT

Adversarial attacks can easily fool object recognition systems based on deep neural networks (DNNs). Although many defense methods have been proposed in recent years, most of them can still be adaptively evaded. One reason for the weak adversarial robustness may be that DNNs are only supervised by category labels and do not have part-based inductive bias like the recognition process of humans. Inspired by a well-known theory in cognitive psychology - recognition-by-components, we propose a novel object recognition model ROCK (Recognizing Object by Components with human prior Knowledge). It first segments parts of objects from images, then scores part segmentation results with predefined human prior knowledge, and finally outputs prediction based on the scores. The first stage of ROCK corresponds to the process of decomposing objects into parts in human vision. The second stage corresponds to the decision process of the human brain. ROCK shows better robustness than classical recognition models across various attack settings. These results encourage researchers to rethink the rationality of currently widely-used DNN-based object recognition models and explore the potential of part-based models, once important but recently ignored, for improving robustness.


Subject(s)
Algorithms , Neural Networks, Computer , Humans , Brain , Visual Perception
19.
Article in English | MEDLINE | ID: mdl-37756172

ABSTRACT

The classification problem for short time-window steady-state visual evoked potentials (SSVEPs) is important in practical applications because shorter time-window often means faster response speed. By combining the advantages of the local feature learning ability of convolutional neural network (CNN) and the feature importance distinguishing ability of attention mechanism, a novel network called AttentCNN is proposed to further improve the classification performance for short time-window SSVEP. Considering the frequency-domain features extracted from short time-window signals are not obvious, this network starts with the time-domain feature extraction module based on the filter bank (FB). The FB consists of four sixth-order Butterworth filters with different bandpass ranges. Then extracted multimodal features are aggregated together. The second major module is a set of residual squeeze and excitation blocks (RSEs) that has the ability to improve the quality of extracted features by learning the interdependence between features. The final major module is time-domain CNN (tCNN) that consists of four CNNs for further feature extraction and followed by a fully connected (FC) layer for output. Our designed networks are validated over two large public datasets, and necessary comparisons are given to verify the effectiveness and superiority of the proposed network. In the end, in order to demonstrate the application potential of the proposed strategy in the medical rehabilitation field, we design a novel five-finger bionic hand and connect it to our trained network to achieve the control of bionic hand by human brain signals directly. Our source codes are available on Github: https://github.com/JiannanChen/AggtCNN.git.

20.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11948-11960, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37195849

ABSTRACT

In this paper, we propose a novel Knowledge-based Embodied Question Answering (K-EQA) task, in which the agent intelligently explores the environment to answer various questions using external knowledge. Unlike existing EQA work, which explicitly specifies the target object in the question, the agent can resort to external knowledge to understand more complicated questions such as "Please tell me what are objects used to cut food in the room?", for which the agent must know facts such as "knife is used for cutting food". To address this K-EQA problem, a novel framework based on neural program synthesis reasoning is proposed, where the joint reasoning of the external knowledge and 3D scene graph is performed to realize navigation and question answering. In particular, the 3D scene graph provides a memory that stores the visual information of visited scenes, which significantly improves the efficiency of multi-turn question answering. Experimental results have demonstrated that the proposed framework is capable of answering more complicated and realistic questions in the embodied environment. The proposed method is also applicable to multi-agent scenarios.
