ABSTRACT
Identifying violent activities is important for ensuring public safety. Although the Transformer model has contributed significantly to behavior recognition, it typically requires a large volume of data to perform well. Because existing datasets of violent behavior are scarce, Transformers struggle to recognize violent behavior reliably. Transformers are also computationally heavy and can overlook temporal features. To overcome these issues, the MLP-Mixer architecture can be used to achieve comparable results with a smaller dataset. In this research, we propose a special type of input for the MLP-Mixer called a sequential image collage (SIC). An SIC is created by aggregating the frames of a video clip, in sequence, into an image collage so that the model can better capture the temporal features of violent behavior in videos. Three public datasets were used to train the model: the National Hockey League hockey fights dataset, the smart-city CCTV violence detection dataset, and the real-life violence situations dataset. The experimental results showed that the model trained on the proposed SIC achieves high performance in violent behavior recognition while requiring fewer parameters and FLOPs than other state-of-the-art models.
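The collage construction described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `make_sequential_collage`, the row-major tiling order, and the grid dimensions are assumptions, since the abstract does not state how frames are arranged.

```python
import numpy as np

def make_sequential_collage(frames, rows, cols):
    """Tile rows*cols frames (each H x W x C) into one image, in time order.

    Assumption: frames are laid out row-major (left to right, top to bottom).
    """
    frames = np.asarray(frames)
    if frames.ndim != 4 or frames.shape[0] != rows * cols:
        raise ValueError("expected rows * cols frames of shape (H, W, C)")
    _, h, w, c = frames.shape
    # Arrange the frames on a (rows, cols) grid, then interleave the grid
    # axes with the pixel axes so each frame keeps its own block.
    grid = frames.reshape(rows, cols, h, w, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)
```

For example, six frames of size 2 x 3 tiled on a 2 x 3 grid give a single 4 x 9 image whose top-left block is the first frame and whose bottom-right block is the last.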
ABSTRACT
Recently, deep learning (DL) has received much attention from biochemical researchers and has gradually become a key approach in biosensing applications. Studies have shown that DL techniques for sensing not only shorten data analysis time but also significantly increase the accuracy of data analysis and prediction, improving the performance of biosensing systems compared with conventional methods. However, obtaining reliable equilibrium and rate constants of biomolecular interactions during detection remains difficult and time-consuming. In this study, we propose a model based on deep transfer learning and a sequence-to-sequence autoencoder that transforms a surface plasmon resonance (SPR) sensorgram into the protein-binding constants, that is, the association rate constant (ka) and the dissociation rate constant (kd), which provide crucial information for understanding the mechanisms of drug action and the functional structures of biomolecules. Experimentally, we first trained and tested the pre-trained model on ideal SPR sensorgrams generated with the Langmuir model. We then fine-tuned the pre-trained model on augmented SPR sensorgrams, which were synthesized from a moderate-scale experiment using the synthetic minority oversampling technique (SMOTE). Next, a short experimental SPR sensorgram, requiring only 110 s of measurement, was fed into the fine-tuned model and directly transformed into a reconstructed ideal sensorgram. Finally, the binding kinetic constants ka and kd were obtained by fitting the reconstructed ideal sensorgram. The results showed that the prediction errors of ka and kd obtained by our model were less than 12% and 24%, respectively. Given the convenience, accuracy, and reliability of the proposed DL approach, we believe our strategy significantly improves the feasibility of monitoring the binding affinity of antibodies online during production.
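For illustration, an ideal sensorgram of the kind the Langmuir model produces can be generated with the standard 1:1 binding equations. This is a minimal sketch under stated assumptions: the function name `langmuir_sensorgram`, the parameter names, and the split of the trace into an association phase followed by a dissociation phase are illustrative choices, not details taken from the study.

```python
import math

def langmuir_sensorgram(ka, kd, conc, rmax, t_assoc, t_total, dt=1.0):
    """Ideal 1:1 Langmuir response: association until t_assoc, then dissociation.

    ka   : association rate constant (1/(M*s))
    kd   : dissociation rate constant (1/s)
    conc : analyte concentration (M), assumed constant during association
    rmax : maximum response at full surface saturation (response units)
    """
    k_obs = ka * conc + kd                 # observed association rate
    r_eq = rmax * ka * conc / k_obs        # equilibrium response at this concentration
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_assoc))  # response when injection stops
    times, response = [], []
    for i in range(int(round(t_total / dt)) + 1):
        t = i * dt
        if t <= t_assoc:
            r = r_eq * (1.0 - math.exp(-k_obs * t))
        else:
            r = r_end * math.exp(-kd * (t - t_assoc))
        times.append(t)
        response.append(r)
    return times, response
```

Fitting ka and kd then amounts to finding the values for which this curve best matches a measured or reconstructed sensorgram.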
Subjects
Artificial Intelligence , Deep Learning , Kinetics , Protein Binding , Reproducibility of Results
ABSTRACT
Target tracking is a critical technique for localization in indoor environments. Current target-tracking methods suffer from high overhead, high latency, and blind spots because a large amount of data must be collected or used for training. In many cases, however, a lightweight tracking method is preferable to one that pursues accuracy alone. For this reason, we propose a Wi-Fi-enabled Infrared-like Device-free (WIDE) method for lightweight target tracking. First, we analyze the impact of target movement on the physical layer of the wireless link and establish a near real-time model relating the Channel State Information (CSI) to human motion. Second, we make full use of the network structure formed by the large number of wireless devices already deployed in practice. We validate the WIDE method in different environments. Extensive evaluation results show that the WIDE method is lightweight, tracks targets rapidly, and achieves satisfactory tracking results.
Subjects
Movement , Humans , Motion
ABSTRACT
This study addresses a problem of the power saving mechanism (PSM): the sleep intervals of uplink (UL) connections are not synchronized with the sleep intervals of downlink (DL) connections. That is, the energy of a mobile station (MS) is not really saved if the DL connections are in sleep mode while the UL connections are in normal mode, and vice versa. To avoid this asynchronism of power saving (PS) between UL and DL connections, we propose a mechanism in which the DL connections regulate the UL connections, called the DL and UL Alignment (DUAL) scheme, to improve the energy efficiency of PS. Because the buffer size of the MS is limited, DUAL uses the mean UL packet arrival rate λ_u and a relatively safe buffer-size threshold Q_T as parameters to estimate the maximum allowable waiting time for aligning the UL with the DL connections. A system model of PS is proposed to evaluate the performance of DUAL under different conditions. The correctness of the performance analysis is validated by simulation with realistic parameters. Numerical experiments show that DUAL improves energy conservation significantly when UL traffic is greater than DL traffic.
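The role of λ_u and Q_T can be illustrated with a simple estimate. This is only a sketch under a strong assumption: it takes the maximum allowable waiting time to be the expected time for UL arrivals to fill the remaining buffer up to the threshold, Q_T / λ_u. The abstract does not give the actual DUAL formula, and the function names below are hypothetical.

```python
def max_wait_time(q_threshold, ul_rate):
    """Expected time (s) for UL arrivals to reach the buffer threshold.

    Assumption: waiting time is estimated as Q_T / lambda_u, with
    q_threshold in packets and ul_rate in packets per second.
    """
    if ul_rate <= 0:
        return float("inf")  # no UL traffic, so the buffer never fills
    return q_threshold / ul_rate

def can_defer_uplink(q_threshold, ul_rate, queued, time_to_next_dl):
    """Decide whether UL traffic can wait for the next DL listening window."""
    remaining = max(q_threshold - queued, 0)  # buffer space left below Q_T
    return time_to_next_dl <= max_wait_time(remaining, ul_rate)
```

With Q_T = 50 packets, λ_u = 10 packets/s, and 20 packets already queued, this estimate allows the UL to wait up to 3 s for the DL connection to wake.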
Subjects
Models, Theoretical
ABSTRACT
BACKGROUND AND OBJECTIVE: When seeking medical advice, ordinary people often describe their symptoms inaccurately or vaguely, which often leads to inaccurate preliminary clinical diagnoses. To address this issue, we propose a deep learning model named the knowledgeable diagnostic transformer (KDT) for natural language processing (NLP)-based preliminary clinical diagnosis. METHODS: The KDT extracts symptom-disease relation triples (h,r,t) from patient symptom descriptions by using a proposed bipartite medical knowledge graph (bMKG). Because too many relation triples introduce knowledge noise, we propose a knowledge inclusion-exclusion approach (KIA), which acts as a knowledge filtering layer, to eliminate undesirable triples. Next, we combine token embedding techniques with the transformer model to predict the diseases that patients may have. RESULTS: To train the KDT, we built a medical diagnosis question-answering dataset (the MDQA dataset) containing large-scale, high-quality question (patient symptom description) and answer (diagnosis) corpora in Mandarin, with 2.6M entries (1.07 GB in size). We also trained the KDT on the National Institutes of Health (NIH) English dataset (MedQuAD). The KDT achieved an accuracy of 99% across different evaluation metrics when compared with the baseline transformers used for NLP-based preliminary clinical diagnosis. CONCLUSIONS: Our study demonstrates the effectiveness of the KDT in enhancing diagnostic precision and underscores its potential for the field of preliminary clinical diagnosis. By combining knowledge-based approaches with advanced NLP techniques, the KDT enables more accurate and reliable diagnoses, benefiting both healthcare providers and patients. The KDT has the potential to significantly reduce misdiagnoses and improve patient outcomes.
Subjects
Benchmarking , Natural Language Processing , Humans , Knowledge Bases , Language , Referral and Consultation , United States
ABSTRACT
BACKGROUND: Recently, as a relatively novel technology, artificial intelligence (especially deep learning) has received increasing attention from researchers and has been applied successfully in many biomedical domains. Nonetheless, only a few studies have used deep learning techniques to predict the cardiac resynchronization therapy (CRT) response of heart failure patients. OBJECTIVE: We use deep learning-based techniques to construct a model that predicts the CRT response of patients with high accuracy, precision, and sensitivity. METHODS: Using two-dimensional echocardiographic strain traces from 131 patients, we pre-processed the data and synthesized 2,000 model inputs through the synthetic minority oversampling technique (SMOTE). These inputs were used to train and optimize deep neural networks (DNN) and one-dimensional convolutional neural networks (1D-CNN). Prediction results were visualized using t-distributed stochastic neighbor embedding (t-SNE), and model performance was evaluated using accuracy, precision, sensitivity, F1 score, and specificity. Variable importance was assessed using Shapley additive explanations (SHAP) analysis. RESULTS: Both the optimal DNN and 1D-CNN models demonstrated strong predictive performance, with prediction accuracy, precision, and sensitivity all around 90%. Furthermore, the area under the receiver operating characteristic curve (AUROC) of the optimal 1D-CNN and DNN models reached 0.8734 and 0.9217, respectively. Crucially, the most significant input variables for both models align well with clinical experience, further supporting their robustness and applicability in real-world settings. CONCLUSIONS: We believe that both DL models could assist doctors in predicting treatment response, given their strong prediction performance and the convenience of obtaining the input data needed to predict the CRT response of patients clinically.
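The oversampling step can be sketched as below. This is a simplified illustration of the SMOTE idea (interpolating between a sample and one of its nearest neighbours), not the implementation used in the study; the function name `smote`, its parameters, and the use of squared Euclidean distance are assumptions.

```python
import random

def smote(samples, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between neighbours.

    samples : list of feature vectors (lists of floats) from one class;
              at least two samples are required.
    k       : number of nearest neighbours considered for each base sample.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by squared Euclidean distance to the base.
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Each synthetic sample lies on the line segment between two real samples, so the augmented set stays within the region already covered by the measured data.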
ABSTRACT
This study investigates how to adjust the transmit power of a femto base station (FBS) to mitigate interference between FBSs and mobile users (MUs) in two-tier heterogeneous femtocell networks. A basic requirement when deploying an FBS to increase indoor access bandwidth is that its operation must not prevent outdoor MUs from meeting their quality-of-service (QoS) requirements. To tackle this problem, an FBS transmit power adjustment (FTPA) algorithm is proposed that adjusts the FBS transmit power (FTP) to avoid unwanted cochannel interference (CCI) with neighboring MUs in downlink transmission. FTPA reduces the FTP used to serve its femto users (FUs) according to the QoS requirements of the MUs nearest to the FBS, so that the MU QoS requirement is guaranteed. Simulation results demonstrate that FTPA achieves a low MU outage probability while serving FUs without violating the MU QoS requirements. Simulation results also reveal that FTPA performs better on voice and video services, which are the major trend of future multimedia communication in next-generation networks (NGN).
Subjects
Algorithms , Cell Phone/instrumentation , Electric Power Supplies , Electronics/methods , Models, Theoretical , Signal Processing, Computer-Assisted , Wireless Technology/instrumentation
ABSTRACT
In the emerging technology of generative adversarial networks (GANs), the randomness and unpredictability of the input noise are key to the uniqueness, diversity, robustness, and security of the generated images. Compared with deterministic software-based noise generation, hardware-based noise generation introduces physical entropy sources, such as electronic and photonic noise, to add unpredictability. In this study, bimode Bi2O2Se-based noise generators are demonstrated for use with GANs. With its ultrahigh carrier mobility, excellent air stability, and strong optoelectronic performance, as well as its unique surface resistive switching effect and defect locations in the energy diagram, Bi2O2Se provides a good material platform that integrates easily with multiple device architectures for generating noise from different physical sources. The noise of the dark current mode in a photodetector architecture and the random telegraph noise in a memristor mode were measured, characterized, compared, and analyzed. A Markov chain method combined with K-means clustering was used to calculate the discrete noise states and the transition probability matrix between them. To evaluate the images generated by GANs driven by the hardware noise source, the inception score and Fréchet inception distance were computed.
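The state-extraction step can be sketched as follows: cluster the noise samples into discrete levels with one-dimensional K-means, then count transitions between consecutive labels. This is a minimal illustration, not the authors' code; the function names, the evenly spaced initial centers, and the fixed iteration count are assumptions.

```python
def kmeans_1d(values, k, iters=50):
    """Cluster scalar samples into k levels; return (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumption: start from centers spread evenly across the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign(v):
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(iters):
        labels = [assign(v) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = sum(members) / len(members)
    return centers, [assign(v) for v in values]

def transition_matrix(labels, k):
    """Row-normalized counts of transitions between consecutive states."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Entry (i, j) of the resulting matrix estimates the probability that the signal moves from state i to state j in one sampling step.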
ABSTRACT
In fourth generation and next generation networks, non-real-time variable bit rate (NRT-VBR) and best effort (BE) services will account for over 85% of the total network traffic. In this paper, we study the power saving mechanism of NRT-VBR and BE services for mobile handsets (MHs) to prolong their battery lifetime (i.e., the sustained operation duration) in fourth generation networks. Because the priority of NRT-VBR and BE is lower than that of real-time VBR (RT-VBR) or guaranteed bit rate (GBR) services, we investigate an extended sleep mode for lower priority services (e.g., NRT-VBR and BE) in an MH to conserve energy. The extended sleep mode is used when the MH wakes up from sleep mode but cannot obtain bandwidth from the base station (BS). The proposed mechanism, named the extra power saving scheme (EPSS), uses the M/M/k/k Markovian queuing model to estimate the extended sleep duration so that MHs conserve battery energy when the network is congested. To study the performance of EPSS, an accurate energy analysis model is presented and validated through a series of simulations. Numerical experiments show that EPSS can achieve up to 43% extra energy conservation when the downlink resource is saturated. We conclude that the energy of MHs can be conserved further by applying EPSS when the traffic load is saturated. The energy saving effect becomes more pronounced when the share of NRT-VBR and BE services is greater than that of RT-VBR and GBR services.
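For context, the blocking probability of an M/M/k/k queue is given by the Erlang B formula, which can be computed with a numerically stable recursion. The sketch below shows only that standard quantity; how EPSS maps it to an extended sleep duration is not stated in the abstract, and the function name `erlang_b` is an illustrative choice.

```python
def erlang_b(servers, offered_load):
    """Blocking probability of an M/M/k/k queue (Erlang B).

    servers      : number of servers k (e.g., available bandwidth units)
    offered_load : arrival rate divided by service rate, in Erlangs
    """
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0  # with zero servers every request is blocked
    for n in range(1, servers + 1):
        # Recursion: B(n) = a * B(n-1) / (n + a * B(n-1))
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking
```

A high blocking probability indicates that a waking MH is unlikely to obtain bandwidth, which is the congested situation in which the extended sleep mode applies.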