ABSTRACT
Autonomous driving vehicles rely on sensors for robust perception of their surroundings. Such vehicles are equipped with multiple perceptive sensors with a high level of redundancy to ensure safety and reliability in any driving condition. However, multi-sensor setups such as camera, LiDAR, and radar systems raise requirements related to sensor calibration and synchronization, which are fundamental building blocks of any autonomous system. At the same time, sensor fusion and integration have become important aspects of autonomous driving research and directly determine the efficiency and accuracy of advanced functions such as object detection and path planning. Classical model-based estimation and data-driven models are the two mainstream approaches to achieving such integration. Most recent research is shifting to the latter, which shows high robustness in real-world applications but requires large quantities of data to be collected, synchronized, and properly categorized. However, there are two major research gaps in existing works: (i) they lack fusion (and synchronization) of multiple sensors — camera, LiDAR, and radar; and (ii) they lack a generic, scalable, and user-friendly end-to-end implementation. To generalize the implementation of the multi-sensor perceptive system, we introduce an end-to-end generic sensor dataset collection framework that includes both hardware deployment solutions and sensor fusion algorithms. The framework prototype integrates a diverse set of sensors: camera, LiDAR, and radar. Furthermore, we present a universal toolbox to calibrate and synchronize the three types of sensors based on their characteristics. The framework also includes fusion algorithms that exploit the merits of the three sensors and fuse their sensory information in a manner that supports object detection and tracking research. The generality of this framework makes it applicable to any robotic or autonomous application and suitable for quick, large-scale practical deployment.
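The calibration and synchronization toolbox described above must align samples from sensors running at different rates. A minimal sketch of one common approach — nearest-timestamp matching with a rejection tolerance — is shown below; the function names and the 0.05 s tolerance are illustrative assumptions, not part of the framework.

```python
from bisect import bisect_left

def nearest(timestamps, t):
    """Index of the timestamp closest to t (timestamps sorted ascending)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

def synchronize(camera_ts, lidar_ts, radar_ts, tolerance=0.05):
    """Pair each camera frame with the nearest LiDAR and radar samples.

    Returns index triples (cam_i, lidar_i, radar_i); pairs whose time
    offset exceeds `tolerance` seconds are dropped.
    """
    triples = []
    for ci, t in enumerate(camera_ts):
        li = nearest(lidar_ts, t)
        ri = nearest(radar_ts, t)
        if abs(lidar_ts[li] - t) <= tolerance and abs(radar_ts[ri] - t) <= tolerance:
            triples.append((ci, li, ri))
    return triples
```

In practice, each paired triple would then be passed to the calibration and fusion stages; hardware triggering or PTP clock alignment would replace this purely software-side matching where available.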
ABSTRACT
When traditional super-resolution reconstruction methods are applied to infrared thermal images, they often ignore the poor image quality caused by the imaging mechanism, which makes it difficult to obtain high-quality reconstruction results even when the network is trained on the inverse of simulated degradation processes. To address these issues, we propose a thermal infrared image super-resolution reconstruction method based on multimodal sensor fusion, which enhances the resolution of thermal infrared images and relies on multimodal sensor information to reconstruct high-frequency details, thereby overcoming the limitations of the imaging mechanism. First, we designed a novel super-resolution reconstruction network consisting of primary feature encoding, super-resolution reconstruction, and high-frequency detail fusion subnetworks. We designed hierarchical dilated distillation modules and a cross-attention transformation module to extract and transmit image features, enhancing the network's ability to express complex patterns. Then, we proposed a hybrid loss function to guide the network in extracting salient features from thermal infrared images and reference images while maintaining accurate thermal information. Finally, we proposed a learning strategy to ensure high-quality super-resolution reconstruction even in the absence of reference images. Extensive experimental results show that the proposed method achieves superior reconstructed image quality compared to competing methods, demonstrating its effectiveness.
ABSTRACT
Recently, artificial intelligence (AI) based on IoT sensors has been widely used, which has increased the risk of attacks targeting AI. Adversarial examples are among the most serious types of attacks, in which the attacker designs inputs that cause the machine learning system to generate incorrect outputs. In architectures that use multiple sensor devices, hacking even a few sensors can create a significant risk, because an attacker can attack the machine learning model through the hacked sensors. Some studies have demonstrated adversarial examples against deep neural network (DNN) models based on IoT sensors, but they assumed that the attacker can access all features; the impact of hacking only a few sensors has not been discussed thus far. Therefore, in this study, we discuss the possibility of attacks on DNN models that hack only a small number of sensors. In this scenario, the attacker first hacks a few sensors in the system, obtains their values, and changes them to manipulate the system, but cannot obtain or change the values of the other sensors. We perform experiments on a human activity recognition model with three sensor devices attached to the chest, wrist, and ankle of a user, and demonstrate that attacks are possible by hacking a small number of sensors.
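The partial-sensor threat model above can be illustrated with a small sketch: an FGSM-style perturbation applied only to the feature indices the attacker controls. The logistic classifier, weights, and epsilon below are toy assumptions for illustration — the study's actual target is a DNN over multi-sensor activity features.

```python
import numpy as np

def partial_fgsm(x, w, b, y, hacked_idx, eps=0.3):
    """FGSM-style attack on a logistic classifier, perturbing only the
    features readable/writable through hacked sensors.

    x: input features; w, b: model weights; y: true label (0 or 1);
    hacked_idx: indices of the features the attacker controls.
    """
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))          # predicted P(y=1)
    grad = (p - y) * w                    # d(cross-entropy)/dx
    x_adv = x.copy()
    # Only the hacked features move; the rest stay at their true values.
    x_adv[hacked_idx] += eps * np.sign(grad[hacked_idx])
    return x_adv
```

With enough hacked features (here two of three) the perturbation can push the decision score across the boundary even though the remaining sensor is untouched — the same qualitative effect the study reports for chest/wrist/ankle devices.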
Subjects
Artificial Intelligence, Deep Learning, Humans, Neural Networks (Computer), Machine Learning, Human Activities
ABSTRACT
Industries need a mechanism to monitor workers' safety and to prevent work-related musculoskeletal disorders (WMSDs). The development of ergonomics assessment tools helps industry evaluate workplace design and worker posture. Many studies have proposed automated ergonomics assessment methods to replace manual ones; however, they focused only on calculating body angles and still assessed the wrist section manually. This study aims to (a) propose a wrist kinematics measurement based on unobtrusive sensors, (b) detect potential WMSDs related to wrist posture, and (c) compare the wrist posture of subjects performing assembly tasks to achieve a comprehensive and personalized ergonomic assessment. The wrist posture measurement is combined with the body posture measurement to provide a comprehensive ergonomics assessment based on RULA. Data were collected from subjects who performed an assembly process to evaluate our method. We compared the risk score assessed by an ergonomist with the risk score generated by our method. All body segments achieved more than an 80% similarity score, with the scores for wrist position and wrist twist improved by 6.8% and 0.3%, respectively. A hypothesis analysis was conducted to evaluate differences across subjects. The results indicate that every subject performs tasks differently and has different potential risks regarding wrist posture.
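A RULA-based pipeline like the one above converts measured wrist angles into discrete risk scores. The sketch below mimics the RULA wrist-position and wrist-twist scoring in spirit, with simplified thresholds; treat it as illustrative, not a substitute for the published RULA worksheet.

```python
def wrist_position_score(flexion_deg, deviated, twisted_near_limit):
    """Simplified RULA-style wrist scoring from a measured
    flexion/extension angle (degrees).

    deviated: wrist bent away from the midline (radial/ulnar deviation).
    twisted_near_limit: forearm twist near its end of range.
    Thresholds follow the RULA wrist rules in spirit but are simplified.
    """
    if flexion_deg == 0:
        score = 1                      # neutral wrist
    elif abs(flexion_deg) <= 15:
        score = 2                      # within +/- 15 degrees
    else:
        score = 3                      # beyond +/- 15 degrees
    if deviated:
        score += 1                     # penalty for deviation
    wrist_twist = 2 if twisted_near_limit else 1
    return score, wrist_twist
```

In an automated pipeline, `flexion_deg` would come from the unobtrusive wrist sensors, replacing the manual observation step the cited studies left to the ergonomist.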
Subjects
Wrist Joint, Wrist, Humans, Motion (Physics), Industries, Posture
ABSTRACT
Hydrocephalus is a medical condition characterized by the abnormal accumulation of cerebrospinal fluid (CSF) within the cavities of the brain called ventricles. It frequently follows pediatric and adult congenital malformations, stroke, meningitis, aneurysmal rupture, brain tumors, and traumatic brain injury. CSF diversion devices, or shunts, have been the primary therapy for hydrocephalus for nearly 60 years. However, routine complications associated with a shunt device are infection, obstruction, and overdrainage. Although some patients with shunts (regrettably, the minority) can go for years without complications, even those lucky few may eventually experience a shunt malfunction, and a shunt complication can require emergency intervention. Here, we present a soft, wireless device that monitors distal terminal fluid flow and transmits measurements to a smartphone via low-power Bluetooth communication on request. The proposed multimodal sensing device, enabled by flow sensors for measuring flow rate and electrodes for measuring resistance in a fluidic chamber, allows precise measurement of CSF flow rate over long periods and under any circumstances caused by unexpected or abnormal events. A universal design compatible with any modern commercial spinal fluid shunt system would enable the widespread use of this technology.
Subjects
Cerebrospinal Fluid Shunts, Hydrocephalus, Adult, Cerebrospinal Fluid Shunts/adverse effects, Child, Humans, Hydrocephalus/diagnosis, Hydrocephalus/surgery, Prostheses and Implants
ABSTRACT
The quantitative characterization of movement disorders and their related neurophysiological signals is important for the management of Parkinson's disease (PD). The aim of this study is to develop a novel wearable system enabling the simultaneous measurement of both motion and other neurophysiological signals in PD patients. We designed a wearable system that consists of five motion sensors and three electrophysiology sensors to measure the motion signals of the body, the electroencephalogram, the electrocardiogram, and the electromyogram, respectively. The data captured by the sensors are transferred wirelessly in real time, and the outcomes are analyzed and uploaded to a cloud-based server automatically. We completed pilot studies to (1) test the system's validity by comparing its outcomes to commercial systems, and (2) evaluate deep brain stimulation (DBS) treatment effects in seven PD patients. Our results showed that (1) the motion and neurophysiological signals measured by this wearable system were strongly correlated with those measured by the commercial systems (r > 0.94, p < 0.001); and (2) in the clinical supination and pronation frequency test, the frequency of motion measured by this system increased when DBS was turned on. The results demonstrate that this multi-sensor wearable system can be utilized to quantitatively characterize and monitor motion and neurophysiological signals in PD.
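The validity check above reduces to computing Pearson correlations between the wearable traces and the reference system traces. A minimal, dependency-light version (assuming equally sampled, time-aligned signals):

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two aligned,
    equally sampled signals (e.g. wearable vs. reference gyroscope)."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))
```

A validation like the one reported would compute this per channel and check that every r exceeds the 0.94 threshold (with an accompanying significance test, e.g. `scipy.stats.pearsonr`, for the p-value).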
Subjects
Physiological Monitoring/instrumentation, Parkinson Disease, Wearable Electronic Devices, Deep Brain Stimulation, Electrocardiography, Electroencephalography, Electromyography, Humans, Movement, Parkinson Disease/diagnosis, Parkinson Disease/therapy, Pilot Projects
ABSTRACT
In this paper, we present a multimodal dataset for affective computing research acquired in a human-computer interaction (HCI) setting. An experimental mobile and interactive scenario was designed and implemented based on a gamified generic paradigm for the induction of dialog-based, HCI-relevant emotional and cognitive load states. It consists of six experimental sequences, inducing Interest, Overload, Normal, Easy, Underload, and Frustration. Each sequence is followed by subjective feedback to validate the induction, a respiration baseline to level off the physiological reactions, and a summary of results. Further, prior to the experiment, three questionnaires related to emotion regulation (ERQ), emotional control (TEIQue-SF), and personality traits (TIPI) were collected from each subject to evaluate the stability of the induction paradigm. Based on this HCI scenario, the University of Ulm Multimodal Affective Corpus (uulmMAC), consisting of two homogeneous samples of 60 participants and 100 recording sessions, was generated. We recorded 16 sensor modalities, including 4 × video, 3 × audio, and 7 × biophysiological, depth, and pose streams; additional labels and annotations were also collected. After recording, all data were post-processed and checked for technical and signal quality, resulting in the final uulmMAC dataset of 57 subjects and 95 recording sessions. The evaluation of the reported subjective feedback shows significant differences between the sequences, consistent with the induced states, and the analysis of the questionnaires shows stable results. In summary, our uulmMAC database is a valuable contribution to the field of affective computing and multimodal data analysis: acquired in a mobile interactive scenario close to real HCI, it comprises a large number of subjects and allows transtemporal investigations. Validated via subjective feedback and checked for quality issues, it can be used for affective computing and machine learning applications.
Subjects
Visual Pattern Recognition/physiology, User-Computer Interface, Emotions/physiology, Humans, Machine Learning
ABSTRACT
We report on the adaptation of a smartphone's rear-facing camera to function as a spectrometer that measures the spectrum of light scattered by common paper-based assay test strips. We utilize a cartridge that enables a linear series of test pads in a single strip to be swiped past the read head of the instrument while the phone's camera records video. The strip is housed in a custom-fabricated cartridge that slides through the instrument, facilitating illumination with white light from the smartphone's flash LED directed through an optical fiber. We demonstrate the ability to detect subtle changes in the scattered spectrum, enabling quantitative analysis of single-analyte and multi-analyte strips. The demonstrated capability can be applied to broad classes of paper-based assays in which visual observation of colored strips is not sufficiently quantitative, and for which analysis of the red-green-blue pixel values of a camera image is not capable of measuring complex scattered spectra.
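The core signal-processing step implied above — collapsing each video frame into a 1-D scattered-intensity spectrum — can be sketched as follows. The stripe location and the pixel-to-wavelength mapping are assumptions standing in for the instrument's real calibration.

```python
import numpy as np

def frame_to_spectrum(frame, row_range, px_to_nm):
    """Collapse a video frame's spectral region into a 1-D spectrum.

    frame: 2-D grayscale array with wavelength dispersed along columns;
    row_range: (start, stop) rows covering the illuminated stripe;
    px_to_nm: maps a column index to wavelength (calibration assumed).
    """
    r0, r1 = row_range
    intensity = frame[r0:r1].mean(axis=0)          # average over the stripe
    wavelengths = np.array([px_to_nm(c) for c in range(frame.shape[1])])
    return wavelengths, intensity
```

Per-pad spectra extracted this way can then be compared against reference spectra of known analyte concentrations, which is where the quantitative advantage over plain RGB analysis comes from.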
ABSTRACT
Sign language provides hearing- and speech-impaired individuals with an interface to communicate with other members of society. Unfortunately, sign language is not understood by most people. A gadget based on image processing and pattern recognition can therefore provide a vital aid for detecting and translating sign language into a vocal language. This work presents a system for detecting and understanding sign language gestures with a custom-built software tool and then translating the gestures into a vocal language. To recognize a particular gesture, the system employs a Dynamic Time Warping (DTW) algorithm, and an off-the-shelf software tool is employed for vocal language generation. A Microsoft Kinect is the primary tool used to capture the user's video stream. The proposed method successfully detects gestures stored in the dictionary with an accuracy of 91%. The proposed system can also define and add custom-made gestures. Based on an experiment in which 10 individuals with impairments used the system to communicate with 5 people with no disability, 87% agreed that the system was useful.
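The DTW matching step can be sketched in a few lines. This is the textbook dynamic-programming recurrence applied to 1-D traces; the system's actual features are Kinect joint trajectories, so treat the scalar version below as illustrative.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic Time Warping distance between two 1-D gesture traces.

    Aligns the sequences non-linearly in time so that gestures performed
    at different speeds still match their template.
    """
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(sample, dictionary):
    """Nearest-template gesture label under DTW distance."""
    return min(dictionary, key=lambda label: dtw_distance(sample, dictionary[label]))
```

Adding a custom gesture then amounts to recording one template trace and inserting it into `dictionary` — no retraining is needed, which matches the system's ability to define gestures on the fly.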
Subjects
Communication Aids for Disabled, Correction of Hearing Impairment/instrumentation, Gestures, Speech Disorders/rehabilitation, User-Computer Interface, Video Games, Accelerometry/instrumentation, Correction of Hearing Impairment/methods, Equipment Design, Equipment Failure Analysis, Female, Hand/physiology, Hearing Disorders, Humans, Male, Mobile Applications, Movement/physiology, Pakistan, Pattern Recognition (Automated)/methods, Pilot Projects, Sign Language, Translating, Treatment Outcome, Video Recording, Young Adult
ABSTRACT
Multimodal flexible sensors, consisting of multiple sensing units, can sense and recognize different external stimuli by outputting different types of response signals. However, the recovery and recycling of multimodal sensors are impeded by complex structures and the use of multiple materials. Here, a bimodal flexible sensor that senses strain through resistance change and temperature through voltage change was constructed using poly(vinyl alcohol) hydrogel as the matrix and poly(3,4-ethylenedioxythiophene)/poly(styrenesulfonate) (PEDOT:PSS) as the sensing material, chosen for its conductivity and thermoelectric effect. The plasticity of hydrogels, along with the simplicity of the sensor's components and structure, facilitates easy recovery and recycling. The incorporation of citric acid and ethylene glycol improved the mechanical properties, strain hysteresis, and antifreezing properties of the hydrogels. The sensor exhibits a remarkable response to strain, characterized by high sensitivity (gauge factor of 4.46), a low detection limit (0.1%), fast response and recovery times, minimal hysteresis, and excellent stability. Temperature changes induced by hot air currents, hot objects, and light elicit high response sensitivity, a fast response time, and good stability. Additionally, variations in ambient humidity and temperature minimally affect the sensor's strain response, and the temperature response remains unaffected by humidity changes. The recycled sensors' bimodal sensing of strain and temperature is essentially unchanged. Finally, the bimodal sensors are applied to monitoring body motion and to enabling robots to sense external stimuli.
ABSTRACT
Pain sensation is a crucial aspect of perception in the body. Force-activated nociceptors encode electrochemical signals and yield multilevel information about pain, thus enabling smart feedback. Inspired by this natural template, multi-dimensional mechano-sensing materials provide promising approaches for biomimetic nociceptors in intelligent terminals. However, reliance on non-centrosymmetric crystals has narrowed the range of these materials. Here, the centrosymmetric crystal Cr3+-doped zinc gallogermanate (ZGGO:Cr) with multi-dimensional mechano-sensing is reported, eliminating the crystal-structure limitation. Under force, ZGGO:Cr generates electrical signals imitating those of neuronal systems and produces luminescence for spatial mapping of mechanical stimuli, suggesting a path toward bionic pain perception. On that basis, a wireless biomimetic nociceptor system is developed, achieving a smart pain reflex in a robotic hand and robot-assisted biopsy surgery in rats and dogs.
Subjects
Biomimetics, Nociceptors, Rats, Animals, Dogs, Pain, Artificial Intelligence, Neurons
ABSTRACT
Electronic skin (e-skin) capable of acquiring environmental and physiological information has attracted interest for healthcare, robotics, and human-machine interaction. However, traditional 2D e-skin allows only in-plane force sensing; its planar structure cannot detect out-of-plane signals, which limits access to comprehensive stimulus feedback. Here, a dimension-switchable bioinspired receptor is reported that achieves multimodal perception by exploiting film kirigami. It detects in-plane (pressure and bending) and out-of-plane (force and airflow) signals by dynamically inducing the opening and reclosing of the sensing units. The receptor's hygroscopic and thermoelectric properties enable the sensing of humidity and temperature, and the thermoelectric receptor can differentiate mechanical stimuli from temperature by voltage. This development broadens the sensory capabilities of traditional e-skin and expands its applications in real life.
Subjects
Biomimetic Materials, Humans, Biomimetic Materials/chemistry, Wearable Electronic Devices, Temperature, Biomimetics/methods, Humidity, Artificial Skin, Pressure, Artificial Receptors/chemistry
ABSTRACT
With the growing demand for eco-friendly materials in wearable smart electronic devices, renewable, biocompatible, and low-cost hydrogels based on natural polymers have attracted much attention. Cellulose, as a renewable and degradable natural polymer, shows great potential in wearable smart electronic devices. Multifunctional conductive cellulose-based hydrogels are designed for flexible electronic devices by adding sodium carboxymethyl cellulose and MXene to polyacrylic acid networks. The multifunctional hydrogels possess excellent mechanical properties (stress: 310 kPa; strain: 1127%), toughness (206.67 kJ m-3), conductivity (1.09 ± 0.12 S m-1), and adhesion (82.19 ± 3.65 kPa). The multifunctional conductive hydrogels serve as strain sensors (gauge factor (GF) = 5.79 at 0-700% strain; GF = 14.0 at 700-900% strain; GF = 40.36 at 900-1000% strain; response time: 300 ms; recovery time: 200 ms) and temperature sensors (temperature coefficient of resistance (TCR) = 2.5755 °C-1 over 35-60 °C). The sensor detects human activities with clear and steady signals. A distributed array of flexible sensors is created to measure the magnitude and distribution of pressure, and a hydrogel-based flexible touch keyboard is fabricated to recognize writing trajectories, pressures, and speeds. Furthermore, a flexible hydrogel-based supercapacitor powers an LED and exhibits good cyclic stability over 15,000 charge-discharge cycles.
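The gauge factor and TCR figures quoted above both follow from relative resistance change: GF = (ΔR/R0)/ε for strain and TCR = (ΔR/R0)/ΔT for temperature. A direct computation of both definitions:

```python
def gauge_factor(r0, r, strain):
    """Strain sensitivity: GF = (dR/R0) / strain.

    strain is a fraction, e.g. 0.5 for 50 % elongation.
    """
    return (r - r0) / r0 / strain

def tcr(r0, r, t0, t):
    """Temperature coefficient of resistance: (dR/R0) / dT, in 1/degC."""
    return (r - r0) / r0 / (t - t0)
```

The piecewise GF values reported (5.79, 14.0, 40.36) reflect this calculation performed over the three strain sub-ranges, each fitted separately because the resistance-strain curve is nonlinear at large elongations.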
ABSTRACT
The somatosensory system is crucial for living beings to survive and thrive in complex environments and to interact with their surroundings. Similarly, rapidly developing soft robots need to be aware of their own posture and detect external stimuli. Bending and force sensing are key for soft machines to achieve embodied intelligence. Here, we present a soft inductive bimodal sensor (SIBS) that uses the strain modulation of magnetic permeability and the eddy-current effect for simultaneous bidirectional bending and force sensing with only two wires. The SIBS is made of a flexible planar coil, a porous ferrite film, and a soft conductive film. By measuring the inductance at two different frequencies, the bending angle and force can be obtained and decoupled. Rigorous experiments showed that the SIBS achieves high resolution (0.44° bending and 1.09 mN force), rapid response, excellent repeatability, and high durability. A soft crawling robot embedded with one SIBS can sense its own shape and interact with and respond to external stimuli. Moreover, the SIBS is demonstrated as a wearable human-machine interface that controls a crawling robot via wrist bending and touching. This highlights that the SIBS can be readily implemented in diverse applications for reliable bimodal sensing.
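Decoupling bending and force from the two inductance readings amounts to inverting a calibrated 2 × 2 response. The sensitivity matrix below is a made-up illustration of the idea that one excitation frequency is dominated by the permeability (bending) effect and the other by the eddy-current (force) effect; real values would come from calibration.

```python
import numpy as np

# Hypothetical calibration: inductance shift (uH) per unit bending (deg)
# and per unit force (mN), at two excitation frequencies f1 and f2.
S = np.array([[0.020, 0.001],    # at f1: mostly permeability (bending) response
              [0.004, 0.015]])   # at f2: mostly eddy-current (force) response

def decouple(dL_f1, dL_f2):
    """Recover (bending_deg, force_mN) from inductance shifts at f1 and f2
    by solving the calibrated linear system S @ [bending, force] = dL."""
    return np.linalg.solve(S, np.array([dL_f1, dL_f2]))
```

As long as the two rows of `S` are not proportional (i.e., the two frequencies weight the two physical effects differently), the system is invertible and the two stimuli separate cleanly — which is what makes two-wire bimodal readout possible.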
ABSTRACT
In this study, we investigated a novel approach to fabricating multifunctional ionic gel sensors by using deep eutectic solvents (DESs) as replacements for water. Combining two distinct DESs yielded customizable mechanical and conductive properties, resulting in improved performance compared with traditional hydrogel-based strain sensors. DES ionic gels possess superior mechanical properties, transparency, biocompatibility, and antimicrobial properties, making them suitable for a wide range of applications such as flexible electronics, soft robotics, and healthcare. We conducted a comprehensive evaluation of the DES ionic gels, assessing their performance under extreme temperatures (-70 to 80 °C), their optical transparency (94%), and their biocompatibility. Furthermore, a series of tests evaluated the antibacterial performance of the DES ionic gels against Escherichia coli. Their wide strain (1-400%) and temperature (15-50 °C) sensing ranges demonstrate the versatility and adaptability of DES ionic gels for diverse sensing requirements. The resulting DES ionic gels were successfully applied to human activity and vital sign monitoring, demonstrating their potential for biointegrated sensing devices and healthcare applications. This study offers valuable insights into the development and optimization of gel sensors, particularly for applications that require environmental stability, biocompatibility, and antibacterial performance, thereby paving the way for future advancements in this field.
Subjects
Anti-Bacterial Agents, Deep Eutectic Solvents, Humans, Solvents, Anti-Bacterial Agents/pharmacology, Hydrogels/pharmacology, Water, Escherichia coli, Ions
ABSTRACT
As key interfaces for the disabled, optimal prosthetics should elicit natural sensations of skin touch or proprioception by unambiguously delivering the multimodal signals acquired by the prosthetics to the nervous system, which remains challenging. Here, a bioinspired temperature-pressure electronic skin with decoupling capability (TPD e-skin), inspired by the high-low modulus hierarchical structure of human skin, is developed to restore such functionality. Owing to the bionic dual-state amplifying microstructure and contact resistance modulation, the MXene TPD e-skin exhibits high sensitivity over a wide pressure range and excellent temperature insensitivity (91.2% reduction). Additionally, the high-low modulus structural configuration makes the thermistor insensitive to pressure. Furthermore, a neural model is proposed to neurally code the temperature-pressure signals into three types of nerve-acceptable frequency signals, corresponding to thermoreceptors, slow-adapting receptors, and fast-adapting receptors. Four operational states in the time domain are also distinguished after the neural coding in the frequency domain. In addition, a brain-like, machine learning-based fusion process for the frequency signals is constructed to analyze the frequency pattern and achieve object recognition with a high accuracy of 98.7%. The TPD neural system offers promising potential for advanced prosthetic devices with multimodality-decoupled sensing and deep neural integration.
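The neural coding stage described above maps the decoupled stimuli onto receptor-like firing rates. A toy rate-coding sketch, with illustrative scale factors and baselines that are assumptions for this example, not the paper's model:

```python
def rate_code(pressure, d_pressure_dt, temperature):
    """Toy rate coding of decoupled e-skin signals into three firing
    frequencies (Hz), loosely mimicking thermoreceptors, slow-adapting
    (sustained pressure) and fast-adapting (pressure transient) receptors.

    All scale factors and the 25 degC baseline are illustrative.
    """
    f_thermo = max(0.0, 2.0 * (temperature - 25.0))  # rises above baseline temperature
    f_slow = 5.0 * pressure                           # tracks sustained pressure
    f_fast = 50.0 * abs(d_pressure_dt)                # fires on pressure transients
    return f_thermo, f_slow, f_fast
```

Downstream, a classifier operating on such frequency triplets over time is what the fusion stage analyzes to recognize objects from their combined thermal and mechanical signatures.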
Subjects
Skin, Wearable Electronic Devices, Humans, Elastic Modulus, Skin/chemistry, Touch/physiology
ABSTRACT
Socially Assistive Robots (SARs) are designed to support us in daily life as companions and assistants, but also to support caregivers' work. SARs should show personalized, human-like behavior to improve their acceptance and, consequently, their use. Additionally, they should be trusted by caregivers and professionals when used to support their work (e.g., objective assessment, decision support tools). In this context, the aim of this paper is twofold. First, it presents and discusses a robot behavioral model based on sensing, perception, decision support, and interaction modules. The novel idea behind the proposed model is to extract and use the same multimodal feature set for two purposes: (i) to profile the user, so that the caregiver can use it as a decision support tool for assessing and monitoring the patient; and (ii) to fine-tune the human-robot interaction where the features correlate with social cues. Second, the paper tests the proposed model in a real environment using a SAR, namely ASTRO. In particular, it measures body posture, gait cycle, and handgrip strength during a walking support task. The collected data were analyzed to assess the clinical profile and to fine-tune the physical interaction. Ten older people (65.2 ± 15.6 years) were enrolled in this study and asked to walk with ASTRO at their normal speed for 10 m. The results show good estimation (p < 0.05) of gait parameters, handgrip strength, and angular excursion of the torso with respect to the most commonly used instruments. Additionally, the sensory outputs were combined in the perceptual model to profile the user using non-classical, unsupervised dimensionality reduction techniques, namely t-distributed Stochastic Neighbor Embedding (t-SNE) and non-classical multidimensional scaling (nMDS). These methods can group the participants according to their residual walking abilities.
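The profiling step above embeds per-participant feature vectors with t-SNE so that participants with similar residual walking abilities land near each other. A sketch with simulated gait/grip features — the feature set, group parameters, and t-SNE settings are invented for illustration:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical per-participant features: gait speed (m/s), stride time (s),
# handgrip strength (kg), torso angular excursion (deg) — two simulated groups.
able = rng.normal([1.2, 1.05, 30.0, 5.0], 0.05, size=(5, 4))
frail = rng.normal([0.6, 1.40, 18.0, 9.0], 0.05, size=(5, 4))
features = np.vstack([able, frail])

# Embed the 4-D profiles into 2-D; nearby points = similar walking ability.
embedding = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(features)
print(embedding.shape)  # (10, 2)
```

Because t-SNE preserves local neighborhoods rather than global distances, it suits this grouping-by-ability use better than a purely linear projection; nMDS plays a similar role with an ordinal notion of dissimilarity.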
ABSTRACT
Deaf-mutes face many difficulties in daily interactions with hearing people through spoken language. Sign language is an important means of expression and communication for deaf-mutes. Therefore, breaking the communication barrier between the deaf-mute and hearing communities is significant for facilitating their integration into society. To help them integrate into social life better, we propose a multimodal Chinese sign language (CSL) gesture interaction framework based on social robots. The CSL gesture information, including both static and dynamic gestures, is captured by two different modal sensors: a wearable Myo armband and a Leap Motion sensor, which collect human arm surface electromyography (sEMG) signals and hand 3D vectors, respectively. The two modalities of gesture data are preprocessed and fused to improve recognition accuracy and to reduce the network's processing time before being sent to the classifier. Since the inputs to the proposed framework are temporal gesture sequences, a long short-term memory recurrent neural network is used to classify them. Comparative experiments on an NAO robot test our method. Moreover, our method effectively improves CSL gesture recognition accuracy and has potential applications in a variety of gesture interaction scenarios beyond social robots.
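The sequence-classification stage can be sketched as a small LSTM over fused sEMG + Leap Motion feature sequences. The layer sizes, feature count, and class count below are illustrative assumptions, not those of the paper's network:

```python
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    """Sketch of an LSTM classifier for fused sEMG + Leap Motion sequences.

    Input: (batch, time, features); output: per-gesture class logits.
    """
    def __init__(self, n_features=16, hidden=64, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)       # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])        # classify from the final hidden state

model = GestureLSTM()
logits = model(torch.randn(4, 50, 16))   # 4 sequences, 50 timesteps, 16 features
print(logits.shape)                      # torch.Size([4, 10])
```

Classifying from the final hidden state suits isolated-gesture recognition; for continuous signing, a per-timestep head with temporal smoothing would be the usual alternative.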
ABSTRACT
Flexible wearable devices have been widely used in biomedical applications, the Internet of Things, and other fields, attracting the attention of many researchers. The physiological and biochemical information of the human body reflects various health states, providing essential data for human health examination and personalized medical treatment. Meanwhile, physiological and biochemical information reveals the moving state and position of the human body, and it is the data basis for realizing human-computer interactions. Flexible wearable physiological and biochemical sensors provide real-time, human-friendly monitoring because of their light weight, wearability, and high flexibility. This paper reviews the latest advancements, strategies, and technologies of flexible wearable physiological and biochemical sensors (pressure, strain, humidity, saliva, sweat, and tears). Next, we systematically summarize the integration principles of flexible physiological and biochemical sensors alongside the current research progress. Finally, important directions and challenges for physiological, biochemical, and multimodal sensors are proposed to realize their potential applications in human movement, health monitoring, and personalized medicine.
Subjects
Wearable Electronic Devices, Humans, Sweat, Saliva, Tears
ABSTRACT
Advanced machine intelligence is empowered not only by the ever-increasing computational capability for information processing but also by sensors for collecting multimodal information from complex environments. However, simply assembling different sensors can result in bulky systems and complex data processing. Herein, it is shown that a complementary metal-oxide-semiconductor (CMOS) imager can be transformed into a compact multimodal sensing platform through dual-focus imaging. By combining lens-based and lensless imaging, visual information, chemicals, temperature, and humidity can be detected with the same chip and output as a single image. As a proof of concept, the sensor is equipped on a micro-vehicle, and multimodal environmental sensing and mapping is demonstrated. A multimodal endoscope is also developed, and simultaneous imaging and chemical profiling along a porcine digestive tract is achieved. The multimodal CMOS imager is compact, versatile, and extensible and can be widely applied in microrobots, in vivo medical apparatuses, and other microdevices.