ABSTRACT
Cloud computing has recently gained widespread attention owing to its use in applications involving the Internet of Things (IoT). However, the transmission of massive volumes of data to a cloud server often results in overhead. Fog computing has emerged as a viable solution to address this issue. This study implements an Artificial Intelligence of Things (AIoT) system based on fog computing on a smart farm. Three experiments are conducted to evaluate the performance of the AIoT system. First, network traffic volumes between systems employing and not employing fog computing are compared. Second, the performance of three communication protocols commonly used in IoT applications, the Hypertext Transfer Protocol (HTTP), Message Queuing Telemetry Transport (MQTT), and the Constrained Application Protocol (CoAP), is assessed. Finally, a convolutional neural network-based algorithm is introduced to determine the maturity level of coffee tree images. Experimental data are collected over ten days from a coffee tree farm in the Republic of Korea. Notably, the fog computing system demonstrates a 26% reduction in cumulative data volume compared with a non-fog system. MQTT exhibits stable results in terms of data volume and loss rate. Additionally, the maturity-level determination algorithm provides reliable results on coffee fruits.
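The abstract does not give implementation details, but the MQTT side of the protocol comparison can be illustrated with a minimal publisher sketch using the paho-mqtt helper API; the broker host, topic, and payload fields below are hypothetical placeholders, not values from the study.

```python
# Minimal sketch of a fog-layer node publishing an aggregated sensor reading over
# MQTT with the paho-mqtt helper API. Broker host, topic, and payload fields are
# hypothetical placeholders.
import json
import paho.mqtt.publish as publish

BROKER_HOST = "fog-gateway.local"        # hypothetical fog-layer broker
TOPIC = "farm/coffee/plot1/climate"      # hypothetical topic

def publish_reading(temperature_c: float, humidity_pct: float) -> None:
    payload = json.dumps({"temperature_c": temperature_c, "humidity_pct": humidity_pct})
    # QoS 1 requests at-least-once delivery, a common choice when loss rate matters.
    publish.single(TOPIC, payload, qos=1, hostname=BROKER_HOST, port=1883)

if __name__ == "__main__":
    publish_reading(24.3, 61.0)
```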
ABSTRACT
Generative Adversarial Networks (GANs) for 3D volume generation and reconstruction are receiving increasing attention in various fields, with applications such as shape generation, visualization, automated design, real-time simulation, and research. However, challenges such as limited training data, high computational costs, and mode collapse persist. We propose combining a Variational Autoencoder (VAE) and a GAN to uncover enhanced 3D structures, and we introduce a stable and scalable progressive-growth approach for generating and reconstructing intricate voxel-based 3D shapes. The cascade-structured network comprises a generator and a discriminator, starting with small voxel sizes and incrementally adding layers, while supervising the discriminator with ground-truth labels in each newly added layer to model a broader voxel space. Our method enhances the convergence speed and improves the quality of the generated 3D models through stable growth, thereby facilitating an accurate representation of intricate voxel-level details. Through comparative experiments with existing methods, we demonstrate the effectiveness of our approach in evaluating voxel quality, variation, and diversity. The generated models exhibit improved accuracy in 3D evaluation metrics and visual quality, making them valuable across various fields, including virtual reality, the metaverse, and gaming.
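As a rough illustration of the progressive-growth idea, the sketch below shows a voxel generator that starts at a coarse resolution and doubles it by appending upsampling blocks. It assumes PyTorch, and the layer sizes and growth schedule are illustrative, not the paper's exact architecture.

```python
# Minimal sketch of a progressively growing voxel generator (illustrative sizes).
import torch
import torch.nn as nn

class ProgressiveVoxelGenerator(nn.Module):
    def __init__(self, latent_dim: int = 128, base_channels: int = 256):
        super().__init__()
        self.latent_dim = latent_dim
        # Start at a coarse 4x4x4 resolution.
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.ConvTranspose3d(latent_dim, base_channels, kernel_size=4),  # 1 -> 4
                nn.BatchNorm3d(base_channels),
                nn.ReLU(inplace=True),
            )
        ])
        self.out_channels = base_channels
        self.to_voxel = nn.Conv3d(base_channels, 1, kernel_size=3, padding=1)

    def grow(self):
        """Add one upsampling block, doubling the voxel resolution."""
        new_channels = max(self.out_channels // 2, 32)
        self.blocks.append(nn.Sequential(
            nn.ConvTranspose3d(self.out_channels, new_channels,
                               kernel_size=4, stride=2, padding=1),  # R -> 2R
            nn.BatchNorm3d(new_channels),
            nn.ReLU(inplace=True),
        ))
        self.out_channels = new_channels
        self.to_voxel = nn.Conv3d(new_channels, 1, kernel_size=3, padding=1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        x = z.view(-1, self.latent_dim, 1, 1, 1)
        for block in self.blocks:
            x = block(x)
        return torch.sigmoid(self.to_voxel(x))  # occupancy probabilities

# Usage: generate 4^3 voxels, then grow to 8^3.
gen = ProgressiveVoxelGenerator()
z = torch.randn(2, 128)
print(gen(z).shape)   # torch.Size([2, 1, 4, 4, 4])
gen.grow()
print(gen(z).shape)   # torch.Size([2, 1, 8, 8, 8])
```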
ABSTRACT
Deep learning technology is generally applied to analyze periodic data, such as electromyography (EMG) and acoustic signals. However, its accuracy is compromised when applied to the anomalous and irregular data obtained using a magneto-impedance (MI) sensor. Thus, we propose and analyze a deep learning model based on recurrent neural networks (RNNs) optimized for the MI sensor, such that it can detect and classify data that are relatively irregular and diverse compared with EMG and acoustic signals. Our proposed method combines long short-term memory (LSTM) and gated recurrent unit (GRU) models to detect and classify metal objects from signals acquired by an MI sensor. First, we configured the various layers used in RNNs within a basic model structure and tested the performance of each layer type. In addition, we increased the accuracy by adjusting the sequence length of the input data and adding further processing to the prediction step. An MI sensor acquires data in a non-contact mode; therefore, the proposed deep learning approach can be applied to drone control, electronic maps, geomagnetic measurement, autonomous driving, and foreign object detection.
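A minimal sketch of a classifier that chains LSTM and GRU layers over a 1-D sensor sequence is shown below, assuming PyTorch; the hidden sizes, sequence length, and number of classes are illustrative, not the paper's settings.

```python
# Minimal sketch: LSTM followed by GRU for classifying 1-D sensor sequences.
import torch
import torch.nn as nn

class LSTMGRUClassifier(nn.Module):
    def __init__(self, input_size: int = 1, hidden_size: int = 64, num_classes: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence_length, input_size)
        h, _ = self.lstm(x)
        h, _ = self.gru(h)
        return self.fc(h[:, -1])  # classify from the last time step

# Usage: a batch of 8 sequences, 256 samples each.
model = LSTMGRUClassifier()
logits = model(torch.randn(8, 256, 1))
print(logits.shape)  # torch.Size([8, 4])
```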
ABSTRACT
When a near-eye display (NED) device reproduces an image at a location close to the eye, the virtual image is implemented at a large angle. The uniformity of the image is unbalanced owing to the change in diffraction efficiency caused by the hologram recording angle and angular selectivity. This study proposes a method for implementing an optimally uniform image by analyzing the diffraction efficiency, and the reconstructed image is analyzed in terms of the angular selectivity generated while reproducing the source point of the diffused image as an intermediate element through a holographic optical element (HOE). This research provides practical results for displaying high-diffraction-efficiency, immersive holographic images in an NED system with the HOE as a uniform intermediate element.
Subjects
Holography; Optical Devices; Holography/methods; Imaging, Three-Dimensional
ABSTRACT
Gaze is an excellent indicator in that it can express interest, intention, and the condition of an object. Recent deep-learning methods are mainly appearance-based methods that estimate gaze through a simple regression from entire face and eye images. However, this approach sometimes does not give satisfactory results for gaze estimation in low-resolution and noisy images obtained in unconstrained real-world settings (e.g., places with severe lighting changes). In this study, we propose a method that estimates gaze by detecting eye-region landmarks from a single eye image, and this approach is shown to be competitive with recent appearance-based methods. Our approach acquires rich information by extracting more landmarks, including the iris and eye edges, similar to existing feature-based methods. To acquire strong features even at low resolutions, we used the HRNet backbone network to learn representations of images at various resolutions. Furthermore, we used the self-attention module CBAM to obtain a refined feature map with better spatial information, which enhanced the robustness to noisy inputs and yielded a landmark localization error of 3.18%, a 4% improvement over the existing error. The large number of acquired landmarks was then used as input to a lightweight neural network to estimate the gaze. We conducted a within-dataset evaluation on MPIIGaze, which was obtained in a natural environment, and achieved a state-of-the-art performance of 4.32 degrees, a 6% improvement over the existing performance.
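A minimal sketch of only the final stage, a lightweight network that regresses pitch and yaw gaze angles from flattened landmark coordinates, is shown below, assuming PyTorch; the landmark count and layer widths are hypothetical, not the paper's values.

```python
# Minimal sketch: lightweight MLP regressing gaze angles from eye-region landmarks.
import torch
import torch.nn as nn

NUM_LANDMARKS = 50  # hypothetical number of eye-region landmarks

gaze_head = nn.Sequential(
    nn.Linear(NUM_LANDMARKS * 2, 128),  # (x, y) per landmark
    nn.ReLU(inplace=True),
    nn.Linear(128, 64),
    nn.ReLU(inplace=True),
    nn.Linear(64, 2),                   # pitch and yaw
)

landmarks = torch.randn(16, NUM_LANDMARKS * 2)  # batch of flattened landmark coordinates
print(gaze_head(landmarks).shape)               # torch.Size([16, 2])
```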
Subjects
Attention; Neural Networks, Computer; Face; Iris; Lighting
ABSTRACT
Digital pathology analysis using deep learning has been the subject of several studies. As with other medical data, pathological data are not easily obtained. Because deep learning-based image analysis requires large amounts of data, augmentation techniques are used to increase the size of pathological datasets. This study proposes a novel method for synthesizing brain tumor pathology data using a generative model. For image synthesis, we used embedding features extracted from a segmentation module in a general generative model. We also introduce a simple solution for training a segmentation model in an environment in which the mask labels of the training dataset are not supplied. In our experiments, the proposed method did not make great progress in quantitative metrics but showed improved results in a confusion-rate evaluation involving more than 70 subjects and in the quality of the visual output.
Subjects
Oligodendroglioma; Algorithms; Brain; Humans; Image Processing, Computer-Assisted/methods; Oligodendroglioma/diagnostic imaging; Research Design
ABSTRACT
Treatment of facial palsy is essential because neglecting this disorder can lead to serious sequelae and further damage. For an objective evaluation and a consistent rehabilitation training program for facial palsy patients, a clinician's evaluation must be accompanied by a quantitative evaluation. Recent research has evaluated facial palsy using 68 facial landmarks as features. However, facial palsy has numerous features, whereas existing studies use relatively few landmarks; moreover, they do not confirm the degree of improvement in the patient. In addition, as the face of a normal person is not perfectly symmetrical, it must be compared with previous images taken at a different time. Therefore, we introduce three methods that numerically measure the degree of facial palsy after extracting 478 3D facial landmarks from 2D RGB images taken at different times. The proposed numerical approach performs registration to compare the same facial palsy patient at different times. We scale the landmarks by performing scale matching before global registration. After scale matching, coarse registration is performed with global registration. Point-to-plane ICP is then performed using the transformation matrix obtained from global registration as the initial matrix. After registration, the distance symmetry, angular symmetry, and amount of landmark movement are calculated for the left and right sides of the face. The degree of facial palsy at a certain point in time can thus be measured numerically and compared with the degree of palsy at other times. For the same facial expression, the degree of facial palsy at different times can be measured through distance and angle symmetry. For different facial expressions, the degree of facial palsy in the left and right sides can be compared simultaneously through the amount of landmark movement. Through experiments, the proposed method was tested using a facial palsy patient database collected at different times. The experiments involved clinicians and confirmed that the proposed numerical approach can help assess the progression of facial palsy.
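A minimal sketch of the refinement step only, point-to-plane ICP between two landmark point clouds initialized with a coarse transform, is shown below. It assumes the open3d package; the correspondence-distance threshold, normal-estimation radius, and stand-in landmarks are placeholders, not the study's settings.

```python
# Minimal sketch: point-to-plane ICP refinement starting from a coarse transform.
import numpy as np
import open3d as o3d

def refine_registration(source_pts: np.ndarray, target_pts: np.ndarray,
                        coarse_transform: np.ndarray) -> np.ndarray:
    source = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(source_pts))
    target = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(target_pts))
    # Point-to-plane ICP needs normals on the target cloud.
    target.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        source, target,
        max_correspondence_distance=0.02,   # placeholder threshold
        init=coarse_transform,              # e.g. from global registration
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation            # refined 4x4 transform

# Usage with random stand-in landmarks and an identity coarse transform.
src = np.random.rand(478, 3)
dst = src + 0.01
print(refine_registration(src, dst, np.eye(4)))
```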
Subjects
Facial Paralysis; Databases, Factual; Humans; Imaging, Three-Dimensional/methods; Movement
ABSTRACT
Graph Neural Networks (GNNs) are neural networks that learn the representation of nodes and the associated edges that connect them to other nodes while maintaining the graph representation. Graph Convolutional Neural Networks (GCNs), a representative method among GNNs, utilize conventional Convolutional Neural Networks (CNNs), in the context of computer vision, to process data supported by graphs. This paper proposes a one-stage GCN approach for 3D object detection and pose estimation by structuring non-linearly distributed points into a graph. Our network provides the details required to analyze, generate, and estimate bounding boxes by spatially structuring the input data into graphs. Our method proposes a keypoint attention mechanism that aggregates the relative features between points to estimate the category and pose of the object to which the vertices of the graph belong, and it also performs nine-degrees-of-freedom multi-object pose estimation. In addition, to avoid gimbal lock in 3D space, we use quaternion rotations instead of Euler angles. Experimental results showed that memory usage and efficiency could be improved by aggregating point features from the point cloud and their neighbors in a graph structure. Overall, the system achieved comparable performance against state-of-the-art systems.
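To illustrate the rotation-representation choice mentioned above, the sketch below converts a unit quaternion directly to a rotation matrix without any intermediate Euler angles; this is standard quaternion algebra in plain NumPy, not code from the paper.

```python
# Minimal sketch: unit quaternion (w, x, y, z) to 3x3 rotation matrix (no Euler angles).
import numpy as np

def quaternion_to_matrix(q: np.ndarray) -> np.ndarray:
    w, x, y, z = q / np.linalg.norm(q)  # normalise to a unit quaternion
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

# 90-degree rotation about the z-axis.
q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
print(np.round(quaternion_to_matrix(q), 3))
```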
Subjects
Computer Graphics; Imaging, Three-Dimensional; Neural Networks, Computer
ABSTRACT
In this paper, we propose a method for detecting salient objects on which a viewer's gaze would be focused; the method operates on a single image and does not require a gaze-tracking device. A network was constructed using Neg-Region Attention (NRA), which predicts objects with a concentrated line of sight using deep learning techniques. Existing deep learning-based methods have an autoencoder structure, which causes feature loss during the encoding process of compressing and extracting features from the image and the decoding process of expanding and restoring them. As a result, feature loss occurs in the object area of the detection results, or another area is detected as an object. The proposed NRA module reduces feature loss and emphasizes object areas within the encoder. After separating positive and negative regions using the exponential linear unit activation function, attention was performed separately for each region. The attention method, applied without an additional backbone network, emphasized the object area and suppressed the background area. In the experimental results, the proposed method showed higher detection results than conventional methods.
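The sketch below is one illustrative interpretation of the idea of splitting a feature map into positive and negative regions with ELU and weighting each region separately; it assumes PyTorch and is not the paper's exact NRA module.

```python
# Minimal sketch: ELU-based split into positive/negative regions with separate gating.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PosNegRegionAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.pos_gate = nn.Conv2d(channels, channels, kernel_size=1)
        self.neg_gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        activated = F.elu(x)
        pos = torch.clamp(activated, min=0.0)   # positive region
        neg = torch.clamp(activated, max=0.0)   # negative region (ELU keeps it above -1)
        # Learned gates emphasise object areas and suppress background per region.
        out = torch.sigmoid(self.pos_gate(pos)) * pos + torch.sigmoid(self.neg_gate(neg)) * neg
        return out + x  # residual connection keeps the original features

block = PosNegRegionAttention(channels=32)
print(block(torch.randn(1, 32, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```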
Subjects
Eye-Tracking Technology; Neural Networks, Computer
ABSTRACT
The adoption of low-crested and submerged structures (LCS) reduces the wave height behind the structure, depending on changes in the freeboard, and induces stable waves offshore. We aimed to estimate the wave transmission coefficient behind LCS structures to determine the feasible characteristics of wave mitigation. Various empirical formulas based on regression analysis have been proposed to quantitatively predict wave attenuation characteristics for field applications. However, the inherent variability of wave attenuation limits linear statistical approaches such as linear regression analysis. Herein, to develop an optimization model for the hydrodynamic behavior of the LCS, we performed a comprehensive analysis of 10 types of machine learning models, whose prediction accuracy was compared and reviewed against existing empirical formulas. Among the 10 models, the gradient boosting model showed the highest prediction accuracy, with an MSE of 1.0 × 10^-3, an index of agreement of 0.996, a scatter index of 0.065, and a correlation coefficient of 0.983, indicating a performance improvement over the existing empirical formulas. In addition, based on a variable-importance analysis using explainable artificial intelligence, we determined the significant importance of the input variables for the relative freeboard (R_c/H_0) and the relative freeboard to water depth ratio (R_c/h), which confirms that the relative freeboard was the most dominant factor influencing wave attenuation in the hydraulic behavior around the LCS. Thus, we conclude that the performance prediction method using a machine learning model can be applied to various predictive studies in the field of coastal engineering, moving beyond existing empirical-based research.
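A minimal sketch of the gradient-boosting step with feature-importance inspection is shown below, assuming scikit-learn; the synthetic feature matrix merely stands in for dimensionless design variables such as R_c/H_0 and R_c/h, and the hyperparameters are illustrative.

```python
# Minimal sketch: gradient boosting regression of a transmission-coefficient-like target.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.random((500, 4))                                    # stand-in predictors
y = 0.6 - 0.4 * X[:, 0] + 0.05 * rng.standard_normal(500)   # stand-in target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X_train, y_train)

print("MSE:", mean_squared_error(y_test, model.predict(X_test)))
print("feature importances:", model.feature_importances_)   # variable-importance view
```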
Subjects
Artificial Intelligence; Machine Learning
ABSTRACT
Multi-object tracking is a significant field in computer vision, since it provides essential information for video surveillance and analysis. Several deep learning-based approaches have been developed to improve the performance of multi-object tracking by applying the most accurate and efficient combinations of object detection models and appearance embedding extraction models. However, two-stage methods show a low inference speed, since the embedding extraction can only be performed after object detection. To alleviate this problem, single-shot methods, which simultaneously perform object detection and embedding extraction, have been developed and have drastically improved the inference speed. However, there is a trade-off between accuracy and efficiency. Therefore, this study proposes an enhanced single-shot multi-object tracking system that achieves improved accuracy while maintaining a high inference speed. With strong feature extraction and fusion, the object detector of our model achieves an AP score of 69.93% on the UA-DETRAC dataset and outperforms previous state-of-the-art methods, such as FairMOT and JDE. Based on the improved object detection performance, our multi-object tracking system achieves a MOTA score of 68.5% and a PR-MOTA score of 24.5% on the same dataset, also surpassing the previous state-of-the-art trackers.
ABSTRACT
RGB-D cameras have been commercialized, and many applications using them have been proposed. In this paper, we propose a robust registration method for multiple RGB-D cameras. We use the human body tracking system provided by the Azure Kinect SDK to estimate a coarse global registration between cameras. As this coarse global registration has some error, we refine it using feature matching. However, the matched feature pairs include mismatches, which hinder good performance. Therefore, we propose a registration refinement procedure that removes these mismatches by exploiting the coarse global registration. In an experiment, the ratio of inliers among the matched features is greater than 95% for all tested feature matchers. Thus, we experimentally confirm that mismatches can be eliminated via the proposed method even in difficult situations and that a more precise global registration of RGB-D cameras can be obtained.
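The sketch below shows one plausible way to reject mismatched 3-D feature pairs with a coarse camera-to-camera transform: keep only pairs that the transform already brings close together. It is an illustrative interpretation of the refinement idea in plain NumPy, not the paper's exact procedure, and the distance threshold is a placeholder.

```python
# Minimal sketch: filter 3-D feature matches using a coarse 4x4 transform.
import numpy as np

def filter_matches(src_pts: np.ndarray, dst_pts: np.ndarray,
                   coarse_transform: np.ndarray, threshold_m: float = 0.05):
    """src_pts, dst_pts: (N, 3) matched points; coarse_transform: 4x4 src->dst."""
    src_h = np.hstack([src_pts, np.ones((len(src_pts), 1))])   # homogeneous coordinates
    mapped = (coarse_transform @ src_h.T).T[:, :3]              # src mapped into dst frame
    residuals = np.linalg.norm(mapped - dst_pts, axis=1)
    inliers = residuals < threshold_m
    return src_pts[inliers], dst_pts[inliers], inliers

# Usage with synthetic matches: 95 true pairs plus 5 gross mismatches.
rng = np.random.default_rng(1)
src = rng.random((100, 3))
dst = src.copy()
dst[:5] += 1.0                                   # simulate mismatches
_, _, inliers = filter_matches(src, dst, np.eye(4))
print("inlier ratio:", inliers.mean())           # about 0.95
```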
Subjects
Monitoring, Physiologic; Calibration; Humans; Movement
ABSTRACT
The efficiency of a metal detection method using deep learning with data obtained from multiple magnetic impedance (MI) sensors was investigated. The MI sensor is a passive sensor that detects metal objects and magnetic field changes. However, when detecting a metal object, the change in the magnetic field caused by the metal is small and unstable owing to noise. Consequently, there is a limit to the detectable distance. To effectively detect and analyze objects at this distance, a method using deep learning was applied. The detection performances of a convolutional neural network (CNN) and a recurrent neural network (RNN) were compared using data extracted from a self-impedance sensor. The RNN model showed better performance than the CNN model; however, with shallow networks, the CNN model was superior to the RNN model. The performance of a deep-learning-based (DLB) metal detection network using multiple MI sensors was then compared and analyzed. The network performed detection using long short-term memory (LSTM) and CNN models, and the performance was compared according to the number of layers and the size of the metal sheet. The results are expected to contribute to sensor-based DLB detection technology.
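For the CNN branch of the comparison, a minimal 1-D convolutional classifier over a sensor sequence could look like the sketch below, assuming PyTorch; the kernel sizes, channel counts, and sequence length are illustrative, not the paper's configuration.

```python
# Minimal sketch: 1-D CNN classifying an MI-style sensor sequence as metal / no metal.
import torch
import torch.nn as nn

cnn_detector = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3),
    nn.ReLU(inplace=True),
    nn.MaxPool1d(4),
    nn.Conv1d(16, 32, kernel_size=5, padding=2),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(32, 2),                 # metal present / absent
)

signal = torch.randn(8, 1, 512)       # batch of 8 single-channel sequences
print(cnn_detector(signal).shape)     # torch.Size([8, 2])
```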
Subjects
Neural Networks, Computer; Electric Impedance
ABSTRACT
The use of human gestures to interact with devices such as computers or smartphones has presented several problems. This form of interaction relies on gesture interaction technology, such as Leap Motion from Leap Motion, Inc., which enables humans to use hand gestures to interact with a computer. The technology has excellent hand detection performance and even allows simple games to be played using gestures. Another example is the contactless use of a smartphone to take a photograph by simply folding and opening the palm. Research on interaction with other devices via hand gestures is in progress. Similarly, studies on the creation of a hologram display from objects that actually exist are also underway. We propose a hand gesture recognition system that can control a tabletop holographic display based on an actual object. The depth image obtained using the latest time-of-flight-based depth camera, Azure Kinect, is used to obtain information about the hand and hand joints with the deep-learning model CrossInfoNet. Using this information, we developed a real-time system that defines and recognizes gestures indicating basic left, right, up, and down rotation, zoom in, zoom out, and continuous rotation to the left and right.
Subjects
Gestures; Hands; Holography; Pattern Recognition, Automated; Computer Systems; Humans
ABSTRACT
Vehicle detection is an important research area that provides background information for a diversity of unmanned-aerial-vehicle (UAV) applications. In this paper, we propose a vehicle-detection method using a convolutional-neural-network (CNN)-based object detector. We design our method, DRFBNet300, with a Deeper Receptive Field Block (DRFB) module that enhances the expressiveness of feature maps to detect small objects in UAV imagery. We also propose the UAV-cars dataset, which includes the composition and angular distortion of vehicles in UAV imagery, to train DRFBNet300. Lastly, we propose a Split Image Processing (SIP) method to improve the accuracy of the detection model. Our DRFBNet300 achieves 21 mAP at 45 FPS under the MS COCO metric, the highest score among lightweight single-stage methods running in real time. In addition, DRFBNet300, trained on the UAV-cars dataset, obtains the highest AP score at altitudes of 20-50 m. The accuracy improvement obtained by applying the SIP method becomes larger as the altitude increases. DRFBNet300 trained on the UAV-cars dataset with the SIP method operates at 33 FPS, enabling real-time vehicle detection.
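The sketch below is an illustrative interpretation of a split-image-processing style pipeline: tile a large aerial frame, run a detector on each tile, shift the boxes back into frame coordinates, and merge the results with non-maximum suppression. It assumes PyTorch/torchvision, takes the detector as a callable, and uses a placeholder tile size; it is not the paper's exact SIP method.

```python
# Minimal sketch: tile-based detection with box merging via NMS.
import torch
from torchvision.ops import nms

def detect_with_tiling(image: torch.Tensor, detector, tile: int = 300, iou_thr: float = 0.5):
    """image: (3, H, W); detector(patch) -> (boxes [N, 4] in xyxy, scores [N])."""
    _, H, W = image.shape
    all_boxes, all_scores = [], []
    for y0 in range(0, H, tile):
        for x0 in range(0, W, tile):
            patch = image[:, y0:y0 + tile, x0:x0 + tile]
            boxes, scores = detector(patch)
            if len(boxes):
                boxes = boxes + torch.tensor([x0, y0, x0, y0], dtype=boxes.dtype)
                all_boxes.append(boxes)
                all_scores.append(scores)
    if not all_boxes:
        return torch.empty(0, 4), torch.empty(0)
    boxes, scores = torch.cat(all_boxes), torch.cat(all_scores)
    keep = nms(boxes, scores, iou_thr)   # drop duplicate detections along tile borders
    return boxes[keep], scores[keep]

# Usage with a dummy detector that returns one fixed box per tile.
dummy = lambda patch: (torch.tensor([[10.0, 10.0, 50.0, 40.0]]), torch.tensor([0.9]))
frame = torch.zeros(3, 600, 600)
boxes, scores = detect_with_tiling(frame, dummy)
print(boxes.shape, scores.shape)
```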
ABSTRACT
Recently, holographic displays and computer-generated holograms calculated from real existing objects have been actively investigated to support holographic video applications. In this paper, we propose an efficient method for generating 360-degree color holograms of real 3D objects. 360-degree 3D images are generated using an actual 3D image acquisition system consisting of a depth camera and a turntable, together with intermediate-view generation. Then, 360-degree color holograms are calculated using a viewing-window-based computer-generated hologram. We confirmed that floating 3D objects are faithfully reconstructed in all 360-degree directions using our 360-degree tabletop color holographic display.
ABSTRACT
Understanding the growth of graphene over Si species is becoming ever more important as the huge potential of combining these two materials becomes more apparent, not only for device fabrication but also in energy applications, particularly in Li-ion batteries. Thus, the drive for the direct fabrication of graphene over Si is crucial, because indirect approaches, by their very nature, require processing steps that, in general, contaminate, damage, and are costly. In this work, the direct chemical vapor deposition growth of few-layer graphene over Si nanoparticles is systematically explored through experiment and theory with the use of a reducer (H2) or a mild oxidant (CO2) combined with CH4. Unlike the case of CH4, with the use of CO2 as a mild oxidant in the reaction, the graphene layers form neatly over the surface and encapsulate the Si particles, and SiC formation is also prevented. These structures show exceptionally good electrochemical performance as high-capacity anodes for lithium-ion batteries. Density functional theory studies show that the presence of CO2 not only prevents SiC formation but also helps enhance the catalytic activity of the particles by maintaining an SiOx surface. In addition, CO2 can enhance graphitization.
ABSTRACT
Successful commercialization of holographic printers based on holographic stereograms requires a tool for numerically replaying them and assessing their quality before the time-consuming and expensive process of holographic recording. A holographic stereogram encodes 2D images of a 3D scene that are incoherently captured from multiple perspectives and rearranged before recording. This study presents a simulator that builds a full-parallax, full-color, white-light-viewable holographic stereogram from the perspective images captured by a virtual recentering camera, with further numerical reconstruction for any viewer location. By tracking all steps from acquisition to recording, the simulator allows analysis of the radial distortions caused by the optical elements used at the recording stage. Numerical experiments conducted at increasing degrees of pincushion distortion, using the peak signal-to-noise ratio and the structural similarity as image quality metrics, proved its insignificant influence on the reconstructed images in all practical cases.
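A minimal sketch of the image-quality check, computing PSNR and SSIM between a reference reconstruction and a distorted one, is shown below, assuming scikit-image; the two images here are synthetic stand-ins rather than simulator output.

```python
# Minimal sketch: PSNR and SSIM between a reference image and a distorted version.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((256, 256))
distorted = np.clip(reference + 0.02 * rng.standard_normal((256, 256)), 0.0, 1.0)

print("PSNR:", peak_signal_noise_ratio(reference, distorted, data_range=1.0))
print("SSIM:", structural_similarity(reference, distorted, data_range=1.0))
```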
ABSTRACT
When an artificial structure is built in a river, the water quality and hydraulic properties of the river change significantly. In this study, the effects of weirs constructed in the middle section of a river, as part of the Four Major Rivers Restoration Project in Korea, on water quality and hydrological characteristics were analyzed. For multi-dimensional data analysis, a self-organizing map was applied, and statistical techniques including analysis of variance were used. The analysis showed that the cross-sectional area of the river increased significantly after the construction of the weir compared with before, and the flow velocity decreased at a statistically significant level. Regarding water quality, nitrogen, phosphorus, and suspended solids tended to improve after weir construction, whereas chlorophyll-a and bacteria tended to deteriorate. Some water quality parameters, such as chlorophyll-a, were also affected by seasonal influences. To improve the water quality deteriorated by the construction of the weir, it is necessary to consider how to increase the flow velocity of the river through partial opening or operation of the weir. In addition, to determine the effect of sedimentation of particulate matter due to the decrease in flow rate, it is necessary to conduct investigations of sediments around weirs in the future. PRACTITIONER POINTS: Compared with before the construction of the weir, there was no significant change in the flow rate of the river after the construction of the weir. In the case of chlorophyll-a and bacteria, the water quality deteriorated after weir construction. To improve the deteriorated water quality, it is necessary to consider the fundamental management of each pollutant source and the flexible operation of both weirs. For some improved water quality parameters, further research is needed to determine whether these improvements are directly attributable to the construction of a weir.
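A minimal sketch of the before/after comparison step with a one-way analysis of variance is shown below, assuming SciPy; the chlorophyll-a values are synthetic stand-ins, not the study's measurements.

```python
# Minimal sketch: one-way ANOVA comparing a water-quality parameter before and after weir construction.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
before_weir = rng.normal(loc=20.0, scale=5.0, size=60)   # synthetic chlorophyll-a values
after_weir = rng.normal(loc=26.0, scale=5.0, size=60)

stat, p_value = f_oneway(before_weir, after_weir)
print(f"F = {stat:.2f}, p = {p_value:.4f}")   # a small p-value suggests a significant change
```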
Subjects
Rivers; Water Quality; Rivers/chemistry; Hydrology; Republic of Korea; Chlorophyll A/analysis; Environmental Monitoring; Chlorophyll/analysis
ABSTRACT
This study aimed to investigate the difference in facial reanimation surgery using functional gracilis muscle transfer between innervation by the masseteric nerve alone and its combined use with a cross-face nerve graft (CFNG), which has not been explored before. A novel analysis method based on artificial intelligence (AI) was employed to compare the outcomes of the two approaches. Using AI, 3-dimensional facial landmarks were extracted from 2-dimensional photographs, and distance and angular symmetry scores were calculated. The patients were divided into two groups, with Group 1 undergoing one-stage CFNG and masseteric nerve dual innervation, and Group 2 receiving the masseteric nerve only. The symmetry scores were obtained before and 1 year after surgery to assess the degree of change. Of the 35 patients, Group 1 included 13 patients and Group 2 included 22 patients. The analysis revealed that, in the resting state, the changes in the mouth-corner symmetry scores for distance symmetry (2.55 ± 2.94 and 0.52 ± 2.75 for Groups 1 and 2, respectively, p = 0.048) and angle symmetry (1.21 ± 1.43 and 0.02 ± 0.22 for Groups 1 and 2, respectively, p = 0.001) were significantly greater in Group 1, indicating a more symmetric pattern after surgery. In the smile state, only the angle symmetry improved more symmetrically in Group 1 (3.20 ± 2.38 and 1.49 ± 2.22 for Groups 1 and 2, respectively, p = 0.041). Within the limitations of the study, this new analysis method enabled a more accurate numerical symmetry score to be obtained, and while the degree of mouth-corner excursion was sufficient with the masseteric nerve alone, accompanying CFNG led to further improvement in symmetry in the resting state.
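The sketch below shows one plausible way to compute distance and angle symmetry for a single left/right landmark pair relative to a facial midline point, in plain NumPy; the landmark pairing, midline choice, and coordinates are illustrative, not the study's exact scoring protocol.

```python
# Minimal sketch: distance and angle asymmetry for one mirrored landmark pair.
import numpy as np

def symmetry_scores(left_pt: np.ndarray, right_pt: np.ndarray, midline_pt: np.ndarray):
    """Return (distance asymmetry, angle asymmetry in degrees) for one landmark pair."""
    v_left = left_pt - midline_pt
    v_right = right_pt - midline_pt
    v_right_mirrored = v_right * np.array([-1.0, 1.0, 1.0])   # mirror across the midline plane
    dist_asym = abs(np.linalg.norm(v_left) - np.linalg.norm(v_right))
    cos_angle = np.dot(v_left, v_right_mirrored) / (
        np.linalg.norm(v_left) * np.linalg.norm(v_right_mirrored))
    angle_asym = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return dist_asym, angle_asym

# Usage: left and right mouth corners relative to a nose-tip midline point (illustrative values).
left_corner = np.array([-30.0, -40.0, 5.0])
right_corner = np.array([28.0, -42.0, 5.0])
nose_tip = np.array([0.0, 0.0, 0.0])
print(symmetry_scores(left_corner, right_corner, nose_tip))
```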