ABSTRACT
Customs inspection using X-ray imaging is a very promising application of modern pattern recognition technology. However, the scarcity of data and the periodic renewal of tariff items make applying such technology difficult. In this paper, we present a data augmentation technique based on a new image-to-image translation method to address these difficulties. Unlike conventional methods that convert a semantic label image into a realistic image, the proposed method takes a specially modified texture map as an additional input to a generative adversarial network in order to reproduce domain-specific characteristics, such as background clutter or sensor-specific noise patterns. The proposed method was validated by applying it to backscatter X-ray (BSX) vehicle data augmentation. The Fréchet inception distance (FID) of the results indicates that the visual quality of the translated images improved significantly over the baseline when the texture parameters were used. Additionally, in terms of data augmentation, experimental results on classification, segmentation, and detection show that using the translated image data along with the real data consistently improved the performance of the trained models. Our findings show that a detailed depiction of texture in the translated images is crucial for data augmentation. Considering the comparatively few studies that have examined customs inspection of container-scale goods, such as cars, we believe this study will facilitate research on the automation of container screening and on aviation and port security.
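As a rough illustration of how the FID metric used above compares two image distributions, the sketch below computes the Fréchet distance between two Gaussians fitted to feature activations. The feature arrays are synthetic stand-ins (a real FID computation uses Inception-v3 activations of the real and translated images); this is a minimal sketch, not the authors' implementation.

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^(1/2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real        # drop tiny imaginary parts from numerics
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Stand-in feature sets (real FID uses Inception-v3 activations).
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))
fake_feats = rng.normal(size=(500, 8)) + 0.5   # shifted: imperfect translation
mu_r, sig_r = real_feats.mean(0), np.cov(real_feats, rowvar=False)
mu_f, sig_f = fake_feats.mean(0), np.cov(fake_feats, rowvar=False)
fid = frechet_distance(mu_r, sig_r, mu_f, sig_f)
```

A lower FID means the translated-image statistics sit closer to the real-image statistics, which is how the texture-map input was shown to help.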
Subjects
Deep Learning , Image Processing, Computer-Assisted , Radiography , X-Rays
ABSTRACT
Facial expressions are one of the important non-verbal cues used to understand human emotions during communication. Thus, acquiring and reproducing facial expressions is helpful in analyzing human emotional states. However, owing to complex and subtle facial muscle movements, modeling facial expressions from images with varying face poses is difficult. To handle this issue, we present a method for acquiring facial expressions from a non-frontal single photograph using a 3D-aided approach. In addition, we propose a contour-fitting method that improves modeling accuracy by automatically rearranging the 3D contour landmarks corresponding to fixed 2D image landmarks. The acquired facial expression input can be parametrically manipulated to create various facial expressions through a blendshape or through expression transfer based on the FACS (Facial Action Coding System). To achieve realistic facial expression synthesis, we propose an exemplar-texture wrinkle synthesis method that extracts and synthesizes the appropriate expression wrinkles for the target expression. To do so, we constructed a wrinkle table of various facial expressions from 400 people. As one application, we showed through a quantitative evaluation that the expression-pose synthesis method is suitable for expression-invariant face recognition, and demonstrated its effectiveness in a qualitative evaluation. We expect our system to benefit various fields such as face recognition, HCI, and data augmentation for deep learning.
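The blendshape manipulation mentioned above can be pictured as a toy linear model: a new expression is the neutral mesh plus weighted offsets toward expression targets. The tiny meshes and weights below are invented for illustration; the actual system operates on full FACS-driven face models.

```python
import numpy as np

# Toy 3-vertex meshes; real models have thousands of vertices (values invented).
neutral = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
smile   = np.array([[0.0, 0.1, 0.0], [1.0, 0.2, 0.0], [0.0, 1.0, 0.0]])
frown   = np.array([[0.0, -0.1, 0.0], [1.0, 0.0, 0.0], [0.0, 0.9, 0.0]])

def blend(neutral, targets, weights):
    """Linear blendshape: neutral + sum_i w_i * (target_i - neutral)."""
    out = neutral.copy()
    for target, w in zip(targets, weights):
        out += w * (target - neutral)
    return out

face = blend(neutral, [smile, frown], [0.7, 0.0])   # 70% of the way to "smile"
```

Varying the weights continuously is what lets an acquired expression be manipulated parametrically.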
Subjects
Face , Facial Expression , Computer Simulation , Emotions , Facial Muscles , Humans , Imaging, Three-Dimensional , Movement
ABSTRACT
This paper presents a novel three-dimensional (3D) multi-spectrum sensor system, which combines a 3D depth sensor and multiple optical sensors for different wavelengths. Various image sensors, such as visible, infrared (IR), and 3D sensors, have been introduced into the commercial market. Since each sensor has its own advantages under various environmental conditions, the performance of an application depends highly on selecting the correct sensor or combination of sensors. In this paper, a sensor system comprising visible, thermal-IR, and time-of-flight (ToF) sensors, which we refer to as a 3D multi-spectrum sensor system, is proposed. Since the proposed system integrates the information from each sensor into one calibrated framework, the optimal sensor combination for an application can be easily selected, taking all combinations of sensor information into account. To demonstrate the effectiveness of the proposed system, a face recognition system under light and pose variation is designed. With the proposed sensor system, the optimal sensor combination, which provides effective new fused features for face recognition, is obtained.
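Integrating the sensors into one calibrated framework essentially means being able to map measurements between sensor coordinate frames. As a sketch, the function below projects a 3D point from a depth (ToF) frame into another sensor's image using hypothetical extrinsic (R, t) and intrinsic (K) calibration parameters; the numbers are invented, not the paper's calibration.

```python
import numpy as np

def project_to_sensor(point_3d, R, t, K):
    """Map a 3D point from the ToF frame into another sensor's pixel coordinates
    via extrinsics (R, t) and pinhole intrinsics K."""
    p_cam = R @ point_3d + t       # ToF frame -> target sensor frame
    uvw = K @ p_cam                # perspective projection
    return uvw[:2] / uvw[2]

# Hypothetical calibration: identity extrinsics, generic intrinsics.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
uv = project_to_sensor(np.array([0.0, 0.0, 2.0]), R, t, K)   # a point 2 m ahead
```

Once every sensor's pixels can be mapped this way, features from any subset of sensors can be fused in a common frame.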
Subjects
Biometry/instrumentation , Colorimetry/instrumentation , Face/anatomy & histology , Image Interpretation, Computer-Assisted/instrumentation , Imaging, Three-Dimensional/instrumentation , Pattern Recognition, Automated/methods , Transducers , Equipment Design , Equipment Failure Analysis , Humans
ABSTRACT
Childhood to adolescence is a period of accelerated growth, and genetic features can influence differences in individual growth patterns. In this study, we examined the genetic basis of early age facial growth (EAFG) patterns. Facial shape phenotypes were defined using facial landmark distances, identifying five growth patterns: continued-decrease, decrease-to-increase, constant, increase-to-decrease, and continued-increase. We conducted genome-wide association studies (GWAS) for 10 horizontal and 11 vertical phenotypes. The most significant association for the horizontal phenotypes was rs610831 (TRIM29; β = 0.92, p-value = 1.9 × 10⁻⁹) and for the vertical phenotypes was rs6898746 (ZSWIM6; β = 0.1103, p-value = 2.5 × 10⁻⁸). These variants are highly correlated with genes already reported for facial growth. This study is the first to classify and characterize facial growth patterns and the related genetic polymorphisms.
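Each GWAS association of the kind reported above boils down to a per-SNP regression of the phenotype on genotype dosage, yielding an effect size β and a p-value that is compared against a genome-wide significance threshold. The sketch below simulates one such test; the genotype, phenotype, and effect size are invented for illustration and are not the study's data.

```python
import numpy as np
from scipy import stats

# Simulated single-SNP association test: all numbers here are invented.
rng = np.random.default_rng(1)
n = 300
genotype = rng.integers(0, 3, size=n).astype(float)     # SNP dosage: 0, 1 or 2
phenotype = 0.4 * genotype + 0.5 * rng.normal(size=n)   # a facial-distance phenotype

res = stats.linregress(genotype, phenotype)             # per-SNP linear regression
beta, p = res.slope, res.pvalue
significant = p < 5e-8                                  # conventional genome-wide threshold
```

A full GWAS repeats this test for millions of SNPs (with covariate adjustment), which is why such a stringent threshold is used.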
Subjects
Face , Genome-Wide Association Study , Maxillofacial Development , Asian People/genetics , DNA-Binding Proteins/genetics , Humans , Maxillofacial Development/genetics , Phenotype , Republic of Korea , Transcription Factors/genetics
ABSTRACT
Scanning and acquiring a 3D indoor environment suffers from complex occlusions and misalignment errors. The reconstruction obtained from an RGB-D scanner contains holes in geometry and ghosting in texture. These are easily noticeable and cannot be considered as visually compelling VR content without further processing. On the other hand, the well-known Manhattan World priors successfully recreate relatively simple structures. In this article, we would like to push the limit of planar representation in indoor environments. Given an initial 3D reconstruction captured by an RGB-D sensor, we use planes not only to represent the environment geometrically but also to solve an inverse rendering problem considering texture and light. The complex process of shape inference and intrinsic imaging is greatly simplified with the help of detected planes and yet produces a realistic 3D indoor environment. The generated content can adequately represent the spatial arrangements for various AR/VR applications and can be readily composited with virtual objects possessing plausible lighting and texture.
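Plane-based scene representation of the kind described above typically starts by fitting planes to clusters of reconstructed points. Below is a minimal least-squares plane fit via SVD on toy coplanar points; real pipelines add robust (RANSAC-style) segmentation and handle noise, and this is only an illustrative sketch.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points: returns (n, d) with n . x + d = 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                 # direction of least variance
    return normal, -float(normal @ centroid)

pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0.5, 0.5, 0.0]], float)
normal, d = fit_plane(pts)          # expect a plane parallel to z = 0
```

With planes in hand, both the geometry and the inverse-rendering unknowns (texture, light) become far more constrained than in a free-form reconstruction.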
ABSTRACT
Visual surveillance produces a significant amount of raw video data that can be time-consuming to browse and analyze. In this work, we present a video synopsis methodology called "scene adaptive online video synopsis via dynamic tube rearrangement using octree (SSOcT)" that can effectively condense input surveillance videos. Our method summarizes the input video by analyzing scene characteristics and determining an effective spatio-temporal 3D structure for video synopsis. For this purpose, we first analyze the attributes of each extracted tube with respect to scene geometry and complexity. Then, we adaptively group the tubes using an online grouping algorithm that exploits these scene characteristics. Finally, the tube groups are dynamically rearranged using the proposed octree-based algorithm, which efficiently inserts and refines tubes with high spatio-temporal movement in real time. Extensive experimental results are provided, demonstrating the effectiveness and efficiency of our method in summarizing real-world surveillance videos with diverse scene characteristics.
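The octree rearrangement can be pictured as inserting each tube's spatio-temporal bounding box (x, y, t) into a capacity-limited octree that subdivides crowded regions. This is a generic sketch of such a structure, not the paper's SSOcT algorithm; the coordinates and capacity are arbitrary.

```python
from dataclasses import dataclass, field

@dataclass
class Tube:
    lo: tuple   # (x, y, t) lower corner of the tube's bounding box
    hi: tuple   # (x, y, t) upper corner

@dataclass
class OctreeNode:
    lo: tuple
    hi: tuple
    tubes: list = field(default_factory=list)
    children: list = field(default_factory=list)
    capacity: int = 4                     # split a leaf once it holds more tubes

    def contains(self, tube):
        return all(self.lo[i] <= tube.lo[i] and tube.hi[i] <= self.hi[i]
                   for i in range(3))

    def insert(self, tube):
        if self.children:
            for child in self.children:
                if child.contains(tube):
                    child.insert(tube)
                    return
            self.tubes.append(tube)       # straddles octants: keep at this node
            return
        self.tubes.append(tube)
        if len(self.tubes) > self.capacity:
            self._split()

    def _split(self):
        mid = tuple((self.lo[i] + self.hi[i]) / 2.0 for i in range(3))
        for dx in (0, 1):
            for dy in (0, 1):
                for dt in (0, 1):
                    bits = (dx, dy, dt)
                    lo = tuple(self.lo[i] if b == 0 else mid[i] for i, b in enumerate(bits))
                    hi = tuple(mid[i] if b == 0 else self.hi[i] for i, b in enumerate(bits))
                    self.children.append(OctreeNode(lo, hi, capacity=self.capacity))
        pending, self.tubes = self.tubes, []
        for tube in pending:              # redistribute into the new octants
            self.insert(tube)

# Build a small synopsis volume and insert a few toy tubes.
root = OctreeNode((0.0, 0.0, 0.0), (100.0, 100.0, 100.0))
for i in range(6):
    root.insert(Tube((float(i),) * 3, (float(i + 1),) * 3))
```

Subdividing only where tubes cluster is what keeps insertion and refinement cheap enough for online, real-time use.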
ABSTRACT
OBJECTIVES: This study aimed to demonstrate the application of our automated facial recognition system to measuring facial nerve function, to compare its effectiveness with conventional systems, and to provide a preliminary evaluation of deep learning-based facial grading systems. STUDY DESIGN: Retrospective, observational. SETTING: Tertiary referral center, hospital. PATIENTS: Facial photos taken from 128 patients with facial paralysis and two persons with no history of facial palsy were analyzed. INTERVENTION: Diagnostic. MAIN OUTCOME MEASURES: Correlation with the Sunnybrook (SB) and House-Brackmann (HB) grading scales. RESULTS: Our results showed good reliability and correlation with the other grading systems (r = 0.905 and 0.783 for the Sunnybrook and HB grading scales, respectively), while being less time-consuming than the Sunnybrook grading scale. CONCLUSIONS: Our objective method shows good correlation with both the Sunnybrook and HB grading systems. Furthermore, this system could be developed into an application for use with a variety of electronic devices, including smartphones and tablets.
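The reported agreement with the clinical scales is a Pearson correlation between paired scores. The sketch below computes it on made-up paired grades (not the study's data) to show what an r of roughly 0.9 or above looks like.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Made-up paired grades: automated system vs. clinician Sunnybrook scores.
auto_scores = [12, 35, 48, 60, 72, 88, 95]
sunnybrook  = [10, 30, 55, 58, 70, 90, 92]
r = pearson_r(auto_scores, sunnybrook)
```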
Subjects
Facial Paralysis , Facial Recognition , Facial Asymmetry , Facial Paralysis/diagnosis , Humans , Reproducibility of Results , Retrospective Studies
ABSTRACT
Growth and alterations in craniofacial morphology have attracted interest in many fields of science, especially physical anthropology, genetics and forensic sciences. We performed an analysis of craniofacial morphology alterations by gender and ageing stage in Korean populations. We studied 15 facial metrics using two large Korean populations (1,926 samples from the Korea Medicine Data Center cohort and 5,643 samples from the Ansan-Ansung cohort). Among the 15 metrics, 12 showed gender differences and tended to change with age. In both of the independent populations, brow ridge height, upper lip height, nasal tip height, and profile nasal length tended to increase with age, whereas outer canthal width, right palpebral fissure height, left palpebral fissure height, right upper lip thickness, left upper lip thickness, nasal tip protrusion, facial base width, and lower facial width tended to decrease. In conclusion, our findings suggest that ageing (past 40 years of age) might affect eye size, nose length, upper lip thickness, and facial width, possibly due to loss of elasticity in the face. Therefore, these facial metric changes could be applied to individual age prediction and aesthetic facial care.