ABSTRACT
Reconstruction-based and prediction-based approaches are widely used for video anomaly detection (VAD) in smart city surveillance applications. However, neither approach can effectively exploit the rich contextual information present in videos, which makes it difficult to accurately perceive anomalous activities. In this paper, we borrow the "Cloze Test" training strategy from natural language processing (NLP) and introduce a novel unsupervised learning framework that encodes both motion and appearance information at the object level. Specifically, to store the normal patterns of video activity for reconstruction, we first design an optical flow memory network with skip connections. Secondly, we build a space-time cube (STC) as the basic processing unit of the model and erase a patch in the STC to form the frame to be reconstructed, yielding a so-called "incomplete event" (IE) that the model learns to complete. On this basis, a conditional autoencoder is utilized to capture the strong correspondence between optical flow and the STC, and the model predicts the erased patches in IEs from the context of the preceding and following frames. Finally, we employ a generative adversarial network (GAN)-based training method to improve VAD performance. By discriminating the predicted erased optical flow and erased video frames, our method, which helps reconstruct the original video from an IE, yields more reliable anomaly detection results. Comparative experiments on the benchmark UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets demonstrate AUROC scores of 97.7%, 89.7%, and 75.8%, respectively.
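The erase-and-complete mechanism at the core of this framework can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: the cube size, patch location, and toy encoder-decoder are assumptions made for the example, and the memory network, optical flow conditioning, and GAN discriminator are omitted.

```python
import torch
import torch.nn as nn

def erase_patch(stc, y0, x0, size):
    """Zero out a square patch in every frame of a space-time cube (forms an IE)."""
    ie = stc.clone()
    ie[..., y0:y0 + size, x0:x0 + size] = 0.0
    return ie

class CompletionAE(nn.Module):
    """Toy completion network over stacked frames of one object-level cube."""
    def __init__(self, in_ch):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, in_ch, 4, stride=2, padding=1))
    def forward(self, x):
        return self.dec(self.enc(x))

# One training step on a batch of cubes: 5 frames, 32x32 object crops (assumed).
stc = torch.rand(8, 5, 32, 32)                 # (batch, T, H, W)
ie = erase_patch(stc, y0=12, x0=12, size=8)    # the "incomplete event"
model = CompletionAE(in_ch=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(model(ie), stc)  # learn to fill in the erased patch
opt.zero_grad(); loss.backward(); opt.step()
```

At test time, events that the completion network reconstructs poorly, relative to what it learned from normal training data, are flagged as anomalous.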
ABSTRACT
OBJECTIVE: To determine the association of dryness of the eyes with rheumatoid arthritis severity. METHODS: The cross-sectional, observational study was conducted at the Jinnah Medical College Hospital, Karachi, from December 2020 to May 2021, and comprised adult patients of either gender with rheumatoid arthritis who were diagnosed on the basis of clinical and serological investigations. Data was collected using a structured, pre-tested questionnaire. The Ocular Surface Disease Index questionnaire and Tear Film Breakup Time were used to assess the severity of dry eyes. The Disease Activity Score-28 with erythrocyte sedimentation rate was used to assess the severity of rheumatoid arthritis. The association between the two was explored. Data was analysed using SPSS 22. RESULTS: Of the 61 patients, 52 (85.2%) were females and 9 (14.8%) were males. The overall mean age was 41.7±12.8 years, with 4 (6.6%) aged <20 years, 26 (42.6%) aged 21-40 years, 28 (45.9%) aged 41-60 years and 3 (4.9%) aged >60 years. Further, 46 (75.4%) subjects had sero-positive rheumatoid arthritis, 25 (41%) had high disease severity, 30 (49.2%) had a severe Ocular Surface Disease Index score and 36 (59%) had decreased Tear Film Breakup Time. Logistic regression analysis showed 5.45 times higher odds of severe disease among people with an Ocular Surface Disease Index score >33 (p=0.003). In patients with positive Tear Film Breakup Time, there were 6.25 times higher odds of an increased disease activity score (p=0.001). CONCLUSIONS: Disease activity scores of rheumatoid arthritis were found to have a strong association with dryness of the eyes, a high Ocular Surface Disease Index score and an increased erythrocyte sedimentation rate.
Subjects
Arthritis, Rheumatoid; Dry Eye Syndromes; Keratoconjunctivitis Sicca; Adult; Female; Male; Humans; Middle Aged; Cross-Sectional Studies; Dry Eye Syndromes/diagnosis; Dry Eye Syndromes/epidemiology; Arthritis, Rheumatoid/diagnosis; Arthritis, Rheumatoid/epidemiology; Blood Sedimentation
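The odds ratios reported above are the exponentiated coefficients of a logistic regression model. A minimal sketch of that computation (the study used SPSS 22; the Python code, variable names, and randomly generated data below are illustrative stand-ins, not the study data):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical binary variables: severe RA (DAS-28), OSDI score > 33, abnormal TBUT.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "severe_ra": rng.integers(0, 2, 61),
    "osdi_gt_33": rng.integers(0, 2, 61),
    "tbut_abnormal": rng.integers(0, 2, 61),
})
X = sm.add_constant(df[["osdi_gt_33", "tbut_abnormal"]])
fit = sm.Logit(df["severe_ra"], X).fit(disp=0)
print(np.exp(fit.params))   # exponentiated coefficients = odds ratios
print(fit.pvalues)
```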
ABSTRACT
The development of reinforced polymer composite materials has had a significant influence on the challenging problem of shielding against high-energy photons, particularly X-rays and γ-rays, in industrial and healthcare facilities. The shielding characteristics of heavy materials hold considerable potential for reinforcing concrete blocks. The mass attenuation coefficient is the main physical factor used to measure the narrow-beam γ-ray attenuation of various combinations of magnetite and mineral powders with concrete. Data-driven machine learning approaches can be investigated as an alternative to theoretical calculations, which are often time- and resource-intensive during workbench testing, for assessing the gamma-ray shielding behavior of composites. We developed a dataset using magnetite and seventeen mineral powder combinations at different densities and water/cement ratios, exposed to photon energies ranging from 1 to 1006 kiloelectronvolts (keV). The National Institute of Standards and Technology (NIST) photon cross-section database and software (XCOM) was used to compute the concrete's γ-ray linear attenuation coefficients (LAC). The XCOM-calculated LACs for the seventeen mineral powders were then modelled using a range of machine learning (ML) regressors, the goal being to investigate whether the available dataset and the XCOM-simulated LAC can be replicated with ML techniques in a data-driven approach. The mean absolute error (MAE), root mean square error (RMSE), and R² score were employed to assess the performance of our proposed ML models, specifically a support vector machine (SVM), a 1D convolutional neural network (CNN), a multilayer perceptron (MLP), a linear regressor, a decision tree, a hierarchical extreme learning machine (HELM), an extreme learning machine (ELM), and random forest networks. Comparative results showed that our proposed HELM architecture outperformed the state-of-the-art SVM, decision tree, polynomial regressor, random forest, MLP, CNN, and conventional ELM models. Stepwise regression and correlation analysis were further used to evaluate the forecasting capability of the ML techniques against the benchmark XCOM approach. According to the statistical analysis, the HELM model showed strong consistency between XCOM and predicted LAC values. Additionally, the HELM model performed better in terms of accuracy than the other models used in this study, yielding the highest R² score and the lowest MAE and RMSE.
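The core of an extreme learning machine is simple enough to sketch directly: a fixed random hidden layer followed by a closed-form least-squares solve for the output weights. The toy regressor below uses plain NumPy and invented data shapes; the paper's HELM stacks several such modules hierarchically, which is not reproduced here.

```python
import numpy as np

class ELMRegressor:
    """Single-hidden-layer extreme learning machine."""
    def __init__(self, n_hidden=200, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)        # random nonlinear features

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form solve
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Toy stand-in for the XCOM dataset: mix descriptors + energy -> LAC value.
rng = np.random.default_rng(1)
X, y = rng.random((500, 19)), rng.random(500)
model = ELMRegressor().fit(X[:400], y[:400])
pred = model.predict(X[400:])
mae = np.abs(pred - y[400:]).mean()
rmse = np.sqrt(((pred - y[400:]) ** 2).mean())
print(f"MAE={mae:.3f} RMSE={rmse:.3f}")
```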
ABSTRACT
BACKGROUND: Text mining in the biomedical field has received much attention and is regarded as an important research area, since a great deal of biomedical data is in text format. Topic modeling is one of the popular methods among text mining techniques used to discover hidden semantic structures, so-called topics. However, discovering topics from biomedical data is a challenging task due to its sparsity, redundancy, and unstructured format. METHODS: In this paper, we propose a novel multiple kernel fuzzy topic modeling (MKFTM) technique using fusion probabilistic inverse document frequency and a multiple kernel fuzzy c-means clustering algorithm for biomedical text mining. In detail, the proposed fusion probabilistic inverse document frequency method is used to estimate the weights of global terms, while MKFTM generates the frequencies of local and global terms with bag-of-words. In addition, principal component analysis is applied to eliminate higher-order negative effects for term weights. RESULTS: Extensive experiments were conducted on six biomedical datasets. MKFTM achieved the highest classification accuracies of 99.04%, 99.62%, 99.69%, and 99.61% on the Muchmore Springer dataset and 94.10%, 89.45%, 92.91%, and 90.35% on the Ohsumed dataset. The CH index value of MKFTM is higher, showing that its clustering performance is better than that of state-of-the-art topic models. CONCLUSION: The results confirm that the proposed MKFTM approach handles the sparsity and redundancy problems in biomedical text documents very efficiently. MKFTM discovers semantically relevant topics with high accuracy for biomedical documents and gives better results for classification and clustering. MKFTM is a new approach to topic modeling, with the flexibility to work with a variety of clustering methods.
Subjects
Algorithms; Data Mining; Cluster Analysis; Data Mining/methods; Semantics
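At the heart of MKFTM is fuzzy c-means clustering over IDF-weighted term vectors, which assigns each document a soft membership in every topic. A minimal single-kernel sketch (the multiple-kernel machinery and the fusion probabilistic IDF weighting are not reproduced; the corpus and cluster count are invented):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: returns memberships U (n_docs x c) and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-9
        U = 1.0 / (d ** (2 / (m - 1)))          # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

docs = ["gene expression in tumors", "protein folding pathways",
        "tumor suppressor gene", "pathway of protein signalling"]
X = TfidfVectorizer().fit_transform(docs).toarray()   # IDF-weighted term vectors
U, _ = fuzzy_cmeans(X, c=2)
print(U.round(2))   # soft topic memberships per document
```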
ABSTRACT
Cenani-Lenz syndrome (CLS) is a rare autosomal-recessive congenital disorder affecting the development of the distal limbs. It is characterized mainly by syndactyly and/or oligodactyly, renal anomalies, and characteristic facial features. Mutations in the LRP4 gene, located on human chromosome 11p11.2-q13.1, cause CLS. The LRP4 gene encodes low-density lipoprotein receptor-related protein 4, which mediates SOST-dependent inhibition of bone formation and Wnt signaling. In the study presented here, three families of Pakistani origin segregating CLS in an autosomal recessive manner were clinically and genetically characterized. In two families (A and B), microsatellite-based homozygosity mapping followed by Sanger sequencing identified a novel homozygous missense variant [NM_002334.3: c.295G>C; p.(Asp99His)] in the LRP4 gene. In the third family (C), exome sequencing revealed a second novel homozygous missense variant [NM_002334.3: c.1633C>T; p.(Arg545Trp)] in the same gene. To determine the functional relevance of these variants, we tested their ability to inhibit canonical WNT signaling in a luciferase assay. Wild-type LRP4 robustly inhibited LRP6-dependent WNT signaling. The two mutants p.(Asp99His) and p.(Arg545Trp) inhibited WNT signaling less effectively, suggesting that they reduce LRP4 function.
Subjects
LDL-Receptor Related Proteins; Syndactyly; Humans; LDL-Receptor Related Proteins/genetics; Male; Pedigree; Syndactyly/genetics; Wnt Signaling Pathway/genetics
ABSTRACT
The Covid-19 pandemic is the defining global health crisis of our time. Chest X-rays (CXR) have been an important imaging modality for assisting in the diagnosis and management of hospitalised Covid-19 patients. However, their interpretation is time-intensive for radiologists. Accurate computer-aided systems can facilitate early diagnosis of Covid-19 and effective triaging. In this paper, we propose a fuzzy logic based deep learning (DL) approach to differentiate between CXR images of patients with Covid-19 pneumonia and with interstitial pneumonias not related to Covid-19. The model developed here, referred to as CovNNet, extracts relevant features from CXR images, which are combined with fuzzy images generated by a fuzzy edge detection algorithm. Experimental results show that using a combination of CXR and fuzzy features within a deep learning approach, in which the deep network output is fed to a multilayer perceptron (MLP), yields higher classification performance (accuracy up to 81%) than benchmark deep learning approaches. The approach has been validated on additional datasets, which are continuously generated as the virus spreads, and could help triage patients in acute settings. A permutation analysis is carried out, and a simple occlusion methodology for explaining decisions is also proposed. The proposed pipeline can be easily embedded into present clinical decision support systems.
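The fuzzy edge detection stage can be illustrated with a toy membership function that maps normalised gradient magnitude to a degree of "edgeness"; the exact membership function is an assumption for this sketch, as the abstract does not specify it.

```python
import numpy as np

def fuzzy_edges(img, k=0.1):
    """Toy fuzzy edge map: gradient magnitude mapped through a fuzzy membership."""
    gy, gx = np.gradient(img.astype(float))
    g = np.hypot(gx, gy)
    g = g / (g.max() + 1e-9)            # normalise gradient magnitude to [0, 1]
    return 1.0 - np.exp(-g / k)         # membership degree of "edge pixel"

cxr = np.random.rand(256, 256)          # stand-in for a chest X-ray image
edge_map = fuzzy_edges(cxr)
# A CovNNet-style pipeline would extract CNN features from both `cxr` and
# `edge_map` and concatenate them before the MLP classifier.
print(edge_map.min(), edge_map.max())
```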
ABSTRACT
This study aimed to investigate the reasons for relapse and the patterns of drug use among substance users in Bangladesh. We conducted a descriptive, cross-sectional study among relapse cases of substance users across Bangladesh. Concerning the reasons for relapse after treatment, family unrest (29.5%), peer pressure (27.4%), the urge to reduce depression (24.8%) and craving for drugs (24.3%) were the most frequent. Amphetamine was reported to be the most used drug (76.1%, n = 693), followed by cannabis (75%, n = 683) and alcohol (54.3%, n = 495). Further extensive studies are needed to explore these associations.
Subjects
Drug Users; Substance-Related Disorders; Bangladesh; Cross-Sectional Studies; Humans; Recurrence; Substance-Related Disorders/epidemiology
ABSTRACT
BACKGROUND: Global efforts toward the development and deployment of a vaccine for COVID-19 are rapidly advancing. To achieve herd immunity, widespread administration of vaccines is required, which necessitates significant cooperation from the general public. As such, it is crucial that governments and public health agencies understand public sentiments toward vaccines, which can help guide educational campaigns and other targeted policy interventions. OBJECTIVE: The aim of this study was to develop and apply an artificial intelligence-based approach to analyze public sentiments on social media in the United Kingdom and the United States toward COVID-19 vaccines, to better understand public attitudes and concerns regarding COVID-19 vaccines. METHODS: Over 300,000 social media posts related to COVID-19 vaccines were extracted, including 23,571 Facebook posts from the United Kingdom and 144,864 from the United States, along with 40,268 tweets from the United Kingdom and 98,385 from the United States, from March 1 to November 22, 2020. We used natural language processing and deep learning-based techniques to predict average sentiments, sentiment trends, and topics of discussion. These factors were analyzed longitudinally and geospatially, and manual reading of randomly selected posts on points of interest helped identify underlying themes and validated insights from the analysis. RESULTS: Overall averaged positive, negative, and neutral sentiments were at 58%, 22%, and 17% in the United Kingdom, compared to 56%, 24%, and 18% in the United States, respectively. Public optimism over vaccine development, effectiveness, and trials, as well as concerns over safety, economic viability, and corporate control, were identified. We compared our findings to those of nationwide surveys in both countries and found them to correlate broadly. CONCLUSIONS: Artificial intelligence-enabled social media analysis should be considered for adoption by institutions and governments, alongside surveys and other conventional methods of assessing public attitudes. Such analyses could enable real-time assessment, at scale, of public confidence and trust in COVID-19 vaccines, help address the concerns of vaccine sceptics, and help develop more effective policies and communication strategies to maximize uptake.
Subjects
Artificial Intelligence; COVID-19 Vaccines/administration & dosage; Public Opinion; Social Media/statistics & numerical data; Vaccination/psychology; COVID-19/epidemiology; COVID-19/prevention & control; COVID-19/psychology; Humans; Natural Language Processing; Patient Acceptance of Health Care; SARS-CoV-2/isolation & purification; United Kingdom/epidemiology; United States/epidemiology
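As a simplified stand-in for the deep-learning sentiment models used in the study, a lexicon-based pass with NLTK's VADER shows the basic post-level scoring step; the example posts are invented, and the ±0.05 cut-offs are the conventional VADER thresholds.

```python
# pip install nltk
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

posts = [
    "The vaccine trial results look really promising!",
    "I'm worried about long-term side effects.",
    "Vaccination centre opens Monday.",
]
for post in posts:
    c = sia.polarity_scores(post)["compound"]      # compound score in [-1, 1]
    label = "positive" if c >= 0.05 else "negative" if c <= -0.05 else "neutral"
    print(f"{label:8s} {c:+.2f}  {post}")
```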
ABSTRACT
BACKGROUND: The emergence of SARS-CoV-2 in late 2019 and its subsequent spread worldwide continues to be a global health crisis. Many governments consider contact tracing of citizens through apps installed on mobile phones a key mechanism to contain the spread of SARS-CoV-2. OBJECTIVE: In this study, we sought to explore the suitability of artificial intelligence (AI)-enabled social media analyses using Facebook and Twitter to understand public perceptions of COVID-19 contact tracing apps in the United Kingdom. METHODS: We extracted and analyzed over 10,000 relevant social media posts across an 8-month period, from March 1 to October 31, 2020. We used an initial filter with COVID-19-related keywords, which were predefined as part of an open Twitter-based COVID-19 dataset. We then applied a second filter using contact tracing app-related keywords and a geographical filter. We developed and utilized a hybrid, rule-based ensemble model, combining state-of-the-art lexicon rule-based and deep learning-based approaches. RESULTS: Overall, we observed 76% positive and 12% negative sentiments, with the majority of negative sentiments reported in the North of England. These sentiments varied over time, likely influenced by ongoing public debates around implementing app-based contact tracing using a centralized model, in which data would be shared with the health service, versus decentralized contact-tracing technology. CONCLUSIONS: The variations in sentiment corroborate ongoing debates surrounding the information governance of health-related information. AI-enabled social media analysis of public attitudes in health care can help facilitate the implementation of effective public health campaigns.
Subjects
Artificial Intelligence; COVID-19/epidemiology; Contact Tracing/methods; Mobile Applications; Social Media; Humans; Public Opinion; SARS-CoV-2/isolation & purification
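The hybrid, rule-based ensemble can be sketched abstractly. The fallback rule below is an assumption made for illustration, since the exact rules are not given in the abstract: trust the lexicon when its score is decisive, otherwise defer to the deep learning classifier.

```python
def ensemble_sentiment(lexicon_score, model_probs, conf=0.6):
    """Toy hybrid rule: use the lexicon when decisive, else the DL classifier.

    lexicon_score: compound score in [-1, 1] (e.g. from VADER).
    model_probs:   dict of class -> probability from a trained model.
    """
    if abs(lexicon_score) >= conf:                  # rule fires: lexicon is decisive
        return "positive" if lexicon_score > 0 else "negative"
    return max(model_probs, key=model_probs.get)    # fall back to the model

print(ensemble_sentiment(0.8, {"positive": 0.3, "neutral": 0.5, "negative": 0.2}))
print(ensemble_sentiment(0.1, {"positive": 0.3, "neutral": 0.5, "negative": 0.2}))
```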
ABSTRACT
Object detection has wide applications in intelligent systems and sensor applications. Compared with two-stage detectors, recent one-stage counterparts run more efficiently with comparable accuracy, satisfying the requirements of real-time processing. To further improve the accuracy of the one-stage single shot detector (SSD), we propose a novel Multi-Path fusion Single Shot Detector (MPSSD). Unlike other feature fusion methods, we exploit the connections among different scale representations in a pyramid manner. We propose a feature fusion module to generate new feature pyramids from the multiscale features in SSD, and these pyramids are sent to our pyramid aggregation module to generate the final features. The enhanced features carry both localization and semantic information, improving detection performance at little computational cost. A series of experiments on three benchmark datasets, PASCAL VOC2007, VOC2012, and MS COCO, demonstrates that our approach outperforms many state-of-the-art detectors both qualitatively and quantitatively. In particular, for input images of size 512 × 512, our method attains a mean Average Precision (mAP) of 81.8% on the VOC2007 test set, 80.3% on the VOC2012 test set, and 33.1% on COCO test-dev 2015.
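The pyramid-style fusion of multiscale features can be sketched as a single PyTorch module; the channel counts, map sizes, and layer choices below are illustrative assumptions, not the exact MPSSD design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionBlock(nn.Module):
    """Merge a fine feature map with an upsampled coarse map (one pyramid level)."""
    def __init__(self, fine_ch, coarse_ch, out_ch):
        super().__init__()
        self.lat = nn.Conv2d(fine_ch, out_ch, 1)       # lateral 1x1 projection
        self.top = nn.Conv2d(coarse_ch, out_ch, 1)
        self.smooth = nn.Conv2d(out_ch, out_ch, 3, padding=1)

    def forward(self, fine, coarse):
        up = F.interpolate(self.top(coarse), size=fine.shape[-2:], mode="nearest")
        return self.smooth(self.lat(fine) + up)        # fused: semantics + localization

# Two SSD-style maps: 38x38 (fine) and 19x19 (coarse), batch of 1.
fine = torch.rand(1, 512, 38, 38)
coarse = torch.rand(1, 1024, 19, 19)
fused = FusionBlock(512, 1024, 256)(fine, coarse)
print(fused.shape)   # torch.Size([1, 256, 38, 38])
```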
ABSTRACT
The health status of an elderly person is shaped by the additive effects of ageing and the diseases linked to it, which can lead to 'unstable incapacity'. This health status is reflected in an apparent decline of independence in activities of daily living (ADLs). Detecting ADLs offers possibilities for improving the home life of elderly people, as it can be applied to fall detection systems. This paper presents fall detection in elderly people based on radar image classification of their daily routine activities, using radar data previously collected from 99 volunteers. Machine learning techniques are used to classify six human activities, namely walking, sitting, standing, picking up objects, drinking water, and fall events. Different machine learning algorithms, such as random forest, K-nearest neighbours, support vector machine, long short-term memory, bi-directional long short-term memory, and convolutional neural networks, were used for data classification. To obtain optimum results, we applied data processing techniques, such as principal component analysis and data augmentation, to the available radar images. The aim of this paper is to improve upon the results achieved on a publicly available dataset and thereby advance research on fall detection systems. The best results were obtained using the CNN algorithm with principal component analysis and data augmentation together, reaching an accuracy of 95.30%. The results also demonstrated that principal component analysis was most beneficial when the training data were expanded by augmentation of the available data. In comparison to the state of the art, our proposed approach achieved the highest accuracy.
Subjects
Activities of Daily Living; Radar; Aged; Algorithms; Humans; Machine Learning; Neural Networks, Computer; Walking
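The preprocessing combination that produced the best results, data augmentation followed by principal component analysis, can be sketched with scikit-learn; the array sizes and the flip/noise augmentations are illustrative assumptions, not the study's exact recipe.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
images = rng.random((200, 64, 64))                  # stand-in radar images

# Simple augmentation: horizontal flips and additive noise triple the data.
augmented = np.concatenate([
    images,
    images[:, :, ::-1],
    images + rng.normal(0, 0.01, images.shape),
])

# PCA on flattened images keeps the top components as compact features.
flat = augmented.reshape(len(augmented), -1)
features = PCA(n_components=50).fit_transform(flat)
print(features.shape)                               # (600, 50) -> classifier input
```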
ABSTRACT
With the emerging growth of digital data in information systems, technology faces the challenges of preventing knowledge leakage and of protecting the ownership rights, security, and privacy of valuable and sensitive data. On-demand availability of various data as services in a shared and automated environment has become a reality with the advent of cloud computing. The digital fingerprinting technique has been adopted as an effective solution to protect the copyright and privacy of digital properties from illegal distribution and to identify malicious traitors over the cloud. Furthermore, it is used to trace the unauthorized distribution of multimedia content through the cloud and to identify the responsible user. In this paper, we propose a novel fingerprinting technique for the cloud environment to protect numeric attributes in relational databases for digital privacy management. The proposed fingerprinting scheme is robust and efficient, and it addresses challenges such as securely embedding data over the cloud, which is essential to securing relational databases. The proposed technique achieves a decoding accuracy of 100% when 10% to 30% of records are deleted, 90% at 40% deletion, and 40% at 50% deletion.
Subjects
Computer Security; Electronic Health Records; Cloud Computing; Confidentiality; Privacy; Technology
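The abstract does not give the embedding algorithm, so the sketch below uses a classic least-significant-bit baseline from the relational watermarking literature as a hedged illustration: fingerprint bits are hidden in the low-order bits of selected numeric attributes, keyed by a secret and the tuple's primary key.

```python
import hashlib

SECRET = b"owner-key"

def mark(pk, value, fp_bits, lsb=1):
    """Embed one fingerprint bit into `value`, selected and keyed by primary key `pk`."""
    h = int.from_bytes(hashlib.sha256(SECRET + str(pk).encode()).digest(), "big")
    if h % 3 != 0:                        # mark roughly a third of the tuples
        return value
    bit = fp_bits[h % len(fp_bits)]       # which fingerprint bit to embed here
    return (value & ~lsb) | (bit * lsb)   # overwrite the least significant bit

fingerprint = [1, 0, 1, 1, 0, 1, 0, 0]    # buyer-specific codeword
rows = [(1, 230), (2, 517), (3, 348), (4, 401)]
marked = [(pk, mark(pk, v, fingerprint)) for pk, v in rows]
print(marked)  # detection recomputes h per tuple and majority-votes the bits
```

Because each surviving marked tuple independently re-votes for its fingerprint bit, the codeword can still be decoded after a fraction of records are deleted, which is what the deletion-robustness figures above measure.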
ABSTRACT
Diabetic retinopathy (DR) is an eye disease that alters the blood vessels of a person suffering from diabetes. Diabetic macular edema (DME) occurs when DR affects the macula, causing fluid accumulation in the macula. Efficient screening systems require experts to manually analyze images to recognize diseases. However, due to the challenging nature of the screening method and the lack of trained human resources, devising effective screening-oriented treatment is an expensive task. Automated systems are trying to cope with these challenges; however, existing methods do not generalize well to multiple diseases and real-world scenarios. To solve these issues, we propose a new method comprising two main steps. The first involves dataset preparation and feature extraction; the other improves a custom deep-learning-based CenterNet model trained for eye disease classification. Initially, we generate annotations for suspected samples to locate the precise region of interest, while the other part of the proposed solution trains the CenterNet model over the annotated images. Specifically, we use DenseNet-100 as the feature extractor, on which the one-stage detector CenterNet is employed to localize and classify the disease lesions. We evaluated our method on the challenging APTOS-2019 and IDRiD datasets and attained average accuracies of 97.93% and 98.10%, respectively. We also performed cross-dataset validation with the benchmark EYEPACS and Diaretdb1 datasets. Both qualitative and quantitative results demonstrate that our proposed approach outperforms state-of-the-art methods, owing to the more effective localization power of CenterNet, which can easily recognize small lesions and deal with over-fitted training data. Our proposed framework is proficient at correctly locating and classifying disease lesions. In comparison to existing DR and DME classification approaches, our method can extract representative key points from low-intensity and noisy images and accurately classify them. Hence, our approach can play an important role in the automated detection and recognition of DR and DME lesions.
Subjects
Deep Learning; Diabetes Mellitus; Diabetic Retinopathy; Macular Edema; Diabetic Retinopathy/diagnostic imaging; Humans; Macular Edema/diagnostic imaging
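CenterNet-style detectors are trained against per-class center heatmaps; the target-building step can be sketched as follows, with invented map sizes and lesion positions rather than the authors' training code.

```python
import numpy as np

def draw_center(heatmap, cx, cy, sigma=2.0):
    """Splat a Gaussian peak for one object center onto a class heatmap."""
    h, w = heatmap.shape
    ys, xs = np.ogrid[:h, :w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)   # keep the max where peaks overlap
    return heatmap

# One 128x128 heatmap per lesion class; two hypothetical lesion centers.
heatmap = np.zeros((128, 128), dtype=np.float32)
for cx, cy in [(40, 52), (90, 77)]:
    draw_center(heatmap, cx, cy)
# The network regresses this map; peaks at inference give lesion locations.
print(heatmap.max(), heatmap[52, 40])     # 1.0 at a marked center
```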
ABSTRACT
Transcranial magnetic stimulation (TMS) excites neurons in the cortex, and neural activity can be simultaneously recorded using electroencephalography (EEG). However, TMS-evoked EEG potentials (TEPs) do not reflect transcranial neural stimulation alone, as they can be contaminated by artifacts. Over the last two decades, significant developments in EEG amplifiers, TMS-compatible technology, customized hardware, and open-source software have enabled researchers to develop approaches that substantially reduce TMS-induced artifacts. In TMS-EEG experiments, various physiological and external artifact sources have been identified, and attempts have been made to minimize or remove them using online techniques. Despite these advances, technological issues and methodological constraints prevent straightforward recording of early TEP components. To the best of our knowledge, there is to date no review in the literature covering both TMS-EEG artifacts and EEG technologies. Our survey aims to provide an overview of research studies in this field over the last 40 years. We review TMS-EEG artifacts, their sources, and their waveforms, and present the state of the art in EEG technologies and front-end characteristics. We also propose a synchronization toolbox for TMS-EEG laboratories. We then review subject preparation frameworks and online artifact reduction maneuvers for improving data acquisition, and conclude by outlining open challenges and future research directions in the field.
Subjects
Artifacts; Transcranial Magnetic Stimulation; Electroencephalography; Evoked Potentials; Technology
ABSTRACT
Machine learning (ML)-based algorithms are playing an important role in cancer diagnosis and are increasingly being used to aid clinical decision-making. However, these algorithms commonly operate as 'black boxes', and it is unclear how their decisions are derived. Recently, techniques have been applied to help us understand how specific ML models work and to explain the rationale for their outputs. This study aims to determine why a given type of cancer has a certain phenotypic characteristic. Cancer results in cellular dysregulation, and a thorough consideration of cancer regulators is required. This would increase our understanding of the nature of the disease and help discover more effective diagnostic, prognostic, and treatment methods for a variety of cancer types and stages. Our study proposes a novel explainable analysis of potential biomarkers denoting tumorigenesis in non-small cell lung cancer. A number of these biomarkers are known to appear following various treatment pathways. An enhanced analysis is enabled through a novel mathematical formulation for the regulators of mRNA, the regulators of ncRNA, and the coupled mRNA-ncRNA regulators. Temporal gene expression profiles are approximated in a two-dimensional spatial domain for the transition states before converging to the stationary state, using a system of coupled-reaction partial differential equations. Simulation experiments demonstrate that the proposed mathematical gene-expression profile represents a best fit for the population abundance of these oncogenes. In future, our proposed solution can lead to the development of alternative interpretable approaches through the application of ML models to discover unknown dynamics in gene regulatory systems.
Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Algorithms; Diffusion; Gene Expression Profiling; Humans; Lung Neoplasms/diagnosis; Lung Neoplasms/genetics
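The abstract does not state the exact equations; a generic form of such a coupled system, shown here purely to illustrate the class of model described (with u and v standing for, say, mRNA and ncRNA regulator abundances), is:

```latex
\begin{aligned}
\frac{\partial u}{\partial t} &= D_u \nabla^2 u + f(u, v),\\
\frac{\partial v}{\partial t} &= D_v \nabla^2 v + g(u, v),
\qquad (x, y) \in \Omega \subset \mathbb{R}^2 .
\end{aligned}
```

Here D_u and D_v are diffusion coefficients and f and g are the coupled reaction terms; the stationary state is reached when both time derivatives vanish.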
ABSTRACT
To date, clinicians have been unable to evaluate Psychogenic Non-Epileptic Seizures (PNES) from resting-state electroencephalography (EEG) readouts, and no EEG marker can help differentiate PNES cases from healthy subjects. In this paper, we investigate the power spectral density (PSD) of resting-state EEGs to evaluate the abnormalities in PNES-affected brains. Additionally, we use functional connectivity tools, such as the phase lag index (PLI), and graph-derived metrics to better observe the integration of distributed information of regular and synchronized multi-scale communication within and across inter-regional brain areas. We demonstrated the utility of our method in a cohort study of 20 age- and gender-matched PNES patients and 19 healthy control (HC) subjects. Three classification models, namely support vector machine (SVM), linear discriminant analysis (LDA), and multilayer perceptron (MLP), were employed to model the relationship between the functional connectivity features (rest-HC versus rest-PNES). The best performance in discriminating participants was obtained with the MLP classifier, reporting a precision of 85.73%, a recall of 86.57%, an F1-score of 78.98%, and, finally, an accuracy of 91.02%. In conclusion, our results suggest two main findings. The first is an intrinsic organization of functional brain networks that reflects a dysfunctional level of integration across brain regions, which can provide new insights into the pathophysiological mechanisms of PNES. The second is that functional connectivity features combined with an MLP could be a promising method for classifying resting-state EEG data of PNES patients versus healthy control subjects.
Subjects
Electroencephalography; Seizures; Cohort Studies; Humans; Machine Learning; Rest
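The phase lag index used above can be computed directly from the instantaneous phases of two signals: PLI = |mean(sign(sin(Δφ)))|, which is 0 for no consistent lag and 1 for a perfectly consistent one. A minimal sketch with synthetic signals (not the study's EEG pipeline), using SciPy's Hilbert transform:

```python
import numpy as np
from scipy.signal import hilbert

def pli(x, y):
    """Phase lag index between two 1-D signals."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.sign(np.sin(dphi))))

fs = 250
t = np.arange(0, 4, 1 / fs)                       # 4 s of synthetic "EEG" at 250 Hz
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
y = np.sin(2 * np.pi * 10 * t + 0.8) + 0.5 * np.random.randn(t.size)  # lagged copy
print(f"PLI = {pli(x, y):.2f}")   # near 1 for a consistent lag, near 0 for none
```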
ABSTRACT
Sentiment analysis aims to automatically classify a subject's sentiment (e.g., positive, negative, or neutral) towards a particular aspect, such as a topic, product, movie, or news item. Deep learning has recently emerged as a powerful machine learning technique for tackling the growing demand for accurate sentiment analysis. However, the majority of research efforts are devoted to the English language only, while information of great importance is also available in other languages. This paper presents a novel, context-aware, deep-learning-driven Persian sentiment analysis approach. Specifically, the proposed deep-learning-driven automated feature-engineering approach classifies Persian movie reviews as having positive or negative sentiments. Two deep learning algorithms, convolutional neural networks (CNN) and long short-term memory (LSTM), are applied and compared with our previously proposed manual-feature-engineering-driven, SVM-based approach. Simulation results demonstrate that the LSTM obtained better performance than the multilayer perceptron (MLP), autoencoder, support vector machine (SVM), logistic regression, and CNN algorithms.
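A hedged Keras sketch of an LSTM sentiment classifier of the kind compared above; the vocabulary size, sequence length, layer dimensions, and stand-in data are placeholders, not the paper's architecture.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab, maxlen = 20000, 100                   # placeholder vocabulary / review length
model = keras.Sequential([
    layers.Embedding(vocab, 128),            # learned word vectors
    layers.LSTM(64),                         # sequence encoder
    layers.Dense(1, activation="sigmoid"),   # positive vs negative review
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stand-in for tokenized, padded Persian movie reviews and binary labels.
X = np.random.randint(1, vocab, size=(256, maxlen))
y = np.random.randint(0, 2, size=(256,))
model.fit(X, y, epochs=1, batch_size=32, verbose=0)
```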
ABSTRACT
Offline Arabic Handwriting Recognition (OAHR) has recently become instrumental in the areas of pattern recognition and image processing due to its application in several fields, such as office automation and document processing. However, OAHR continues to face several challenges, including the high variability of the Arabic script and its intrinsic characteristics, such as cursiveness, ligatures, and diacritics, the unlimited variation in human handwriting, and the lack of large public databases. In this paper, we introduce a novel context-aware model based on deep neural networks to address the challenges of recognizing offline handwritten Arabic text, including isolated digits, characters, and words. Specifically, we propose a supervised Convolutional Neural Network (CNN) model that contextually extracts optimal features and employs batch normalization and dropout regularization. This aims to prevent overfitting and further enhance generalization performance compared to conventional deep learning models. We employ a number of deep stacked convolutional layers to design the proposed Deep CNN (DCNN) architecture. The model is extensively evaluated and shown to demonstrate excellent classification accuracy compared to conventional OAHR approaches on a diverse set of six benchmark databases, including MADBase (Digits), CMATERDB (Digits), HACDB (Characters), SUST-ALT (Digits), SUST-ALT (Characters), and SUST-ALT (Names). A further experimental study is conducted on the benchmark Arabic databases by exploiting transfer learning (TL)-based feature extraction, which demonstrates the superiority of our proposed model in relation to the state-of-the-art VGGNet-19 and MobileNet pre-trained models. Finally, experiments are conducted to assess the comparative generalization capabilities of the models using a database in another language, specifically the benchmark MNIST English isolated-digits database; the results further confirm the superiority of our proposed DCNN model.
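The batch-normalization-plus-dropout recipe can be sketched as a small Keras model; the input size, depth, and class count below are placeholders rather than the proposed DCNN's actual configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# 32x32 grayscale glyphs, 10 classes (e.g. isolated digits) -- placeholder sizes.
model = keras.Sequential([
    keras.Input((32, 32, 1)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.BatchNormalization(),                 # stabilizes deep stacked layers
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),                         # regularization against overfitting
    layers.Dense(10, activation="softmax"),
])
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```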
ABSTRACT
With the development of the commodity economy, the emergence of fake and shoddy products has seriously harmed the interests of consumers and enterprises. To tackle this challenge, a customized 2D barcode is proposed to satisfy the requirements of enterprise anti-counterfeiting certification. Based on information hiding technology, the proposed approach provides a solution that is low-cost, difficult to forge, and easy to identify, while retaining the function of conventional 2D barcodes. By trading off perceptual quality against decoding robustness in sensing recognition, the customized 2D barcode can maintain a better aesthetic appearance for anti-counterfeiting and achieve fast encoding. A new picture-embedding scheme was designed that treats a unit image block as the basic encoding unit, with the 2D barcode finder patterns embedded after encoding. Experimental results demonstrated that the proposed customized barcode provides better encoding characteristics while maintaining better decoding robustness than several state-of-the-art methods. Additionally, as a closed-source 2D barcode that is visually anti-counterfeit, the customized 2D barcode can effectively prevent counterfeiting that replicates physical labels. Benefitting from high security, high information capacity, and low cost, the proposed customized 2D barcode with its sensing recognition scheme provides a practical, marketable, and traceable anti-counterfeiting solution for future smart IoT applications.
ABSTRACT
Extraction of relevant lip features is of continuing interest in the visual speech domain. Using end-to-end feature extraction can produce good results, but at the cost of the results being difficult for humans to comprehend and relate to. We present a new, lightweight feature extraction approach, motivated by human-centric, glimpse-based psychological research into facial barcodes, and demonstrate that these simple, easy-to-extract 3D geometric features (produced using Gabor-based image patches) can successfully be used for speech recognition with LSTM-based machine learning. The approach extracts low-dimensionality lip parameters with a minimum of processing. One key difference between using these Gabor-based features and using other features, such as traditional DCT or the currently fashionable CNN features, is that they are human-centric features that can be visualised and analysed by humans, which makes the results easier to explain and visualise. They can also be used for reliable speech recognition, as demonstrated on the Grid corpus. Results for overlapping speakers using our lightweight system gave a recognition rate of over 82%, which compares well with less explainable features in the literature.
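The Gabor-based patch features can be sketched with OpenCV: filter a mouth region with a small bank of oriented Gabor kernels and summarize each response. The kernel parameters and the random stand-in patch are illustrative assumptions.

```python
import cv2
import numpy as np

# Bank of Gabor kernels at several orientations (parameters are illustrative).
kernels = [cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=t,
                              lambd=10.0, gamma=0.5)
           for t in np.linspace(0, np.pi, 4, endpoint=False)]

lip_patch = np.random.rand(64, 96).astype(np.float32)   # stand-in mouth ROI
responses = [cv2.filter2D(lip_patch, -1, k) for k in kernels]

# Compact, human-inspectable parameters: mean response magnitude per orientation.
features = np.array([np.abs(r).mean() for r in responses])
print(features)   # low-dimensional lip descriptor, fed to an LSTM over time
```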