Results 1 - 20 of 60
1.
Sensors (Basel) ; 23(12)2023 Jun 17.
Article in English | MEDLINE | ID: mdl-37420825

ABSTRACT

The milling machine plays an important role in manufacturing because of its versatility in machining. The cutting tool is a critical component because it determines machining accuracy and surface finish, and therefore affects industrial productivity. Monitoring the cutting tool's life is essential to avoid machining downtime caused by tool wear. To prevent unplanned machine downtime and to use the full life of the cutting tool, accurate prediction of the remaining useful life (RUL) of the cutting tool is essential. Different artificial intelligence (AI) techniques estimate the RUL of cutting tools in milling operations with improved prediction accuracy. The IEEE NUAA Ideahouse dataset has been used in this paper for the RUL estimation of the milling cutter. The accuracy of the prediction depends on the quality of the feature engineering performed on the unprocessed data, so feature extraction is a crucial phase in RUL prediction. In this work, the authors consider time-frequency domain (TFD) features such as the short-time Fourier transform (STFT) and different wavelet transforms (WT), along with deep learning (DL) models such as long short-term memory (LSTM), different variants of LSTM, convolutional neural network (CNN), and hybrid models that combine CNN with LSTM variants for RUL estimation. The TFD feature extraction with LSTM variants and hybrid models performs well for the milling cutting tool RUL estimation.
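To make the STFT feature step mentioned in this abstract concrete, the following is a minimal sketch of a magnitude spectrogram computed frame by frame. It is not the authors' pipeline: the function name `stft_magnitude`, the frame length, the hop size, the Hann window, and the synthetic test tone are all assumptions chosen for illustration, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum per Hann-windowed frame (hypothetical helper)."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT over the non-negative frequency bins 0 .. frame_len / 2.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic tone (assumed data): 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone)
# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spec]
```

Each row of `spec` could serve as a time-frequency feature vector for a sequence model such as an LSTM; a production pipeline would normally use an FFT routine rather than the quadratic-time DFT shown here.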


Subjects
Deep Learning , Tool Use Behavior , Artificial Intelligence , Commerce , Engineering
2.
Sensors (Basel) ; 22(5)2022 Mar 01.
Article in English | MEDLINE | ID: mdl-35271073

ABSTRACT

In the last decade, the proactive diagnosis of diseases with artificial intelligence and its aligned technologies has been an exciting and fruitful area. Cardiovascular disease is one of the areas in medical care where constant monitoring is required. Arrhythmia, one of the cardiovascular diseases, is generally diagnosed by doctors using electrocardiography (ECG), which records the heart's rhythm and electrical activity. Neural networks have been widely adopted in the last few years to identify abnormalities. It is found that the probability of detecting arrhythmia increases if the denoised signal is used rather than the raw input signal. This paper compares six filters implemented on ECG signals to improve classification accuracy. Custom convolutional neural networks (CCNNs) are designed to filter ECG data. Extensive experiments are conducted with the six ECG filters and the proposed CCNN models. Comparative analysis reveals that the proposed models outperform the competing models on various performance metrics.


Subjects
Data Analysis , Signal Processing, Computer-Assisted , Artificial Intelligence , Electrocardiography , Neural Networks, Computer
3.
Sensors (Basel) ; 22(2)2022 Jan 10.
Article in English | MEDLINE | ID: mdl-35062478

ABSTRACT

Fused deposition modelling (FDM)-based 3D printing is a trending technology in the era of Industry 4.0 that manufactures products layer by layer. It offers notable benefits such as rapid prototyping, cost-effectiveness, flexibility, and a sustainable manufacturing approach. Along with these advantages, a few defects occur in FDM products during the printing stage. Diagnosing defects that occur during 3D printing is a challenging task, and proper data acquisition and monitoring systems need to be developed for effective fault diagnosis. In this paper, the authors propose a low-cost multi-sensor data acquisition (DAQ) system for detecting various faults in 3D printed products. The data acquisition system was developed using an Arduino micro-controller that collects real-time multi-sensor signals using vibration, current, and sound sensors. Different types of fault conditions were used to introduce various defects in 3D printed products in order to analyze the effect of the fault conditions on the captured sensor data. Time and frequency domain analyses were performed on the captured data to create feature vectors; the chi-square method was used to select the most significant features for training the CNN model. The K-means clustering algorithm was used for data clustering, and the bell curve (normal distribution curve) was used to define individual sensor threshold values under normal conditions. The CNN model classified normal and fault condition data with an accuracy of around 94%; model performance was also evaluated using recall, precision, and F1 score.
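The abstract states that a normal distribution curve was used to define per-sensor thresholds under normal conditions but does not give the rule. One common reading is a band of k standard deviations around the baseline mean; the sketch below assumes that reading, with k = 3, the function names, and the baseline readings all being hypothetical.

```python
import statistics


def normal_thresholds(samples, k=3.0):
    """Lower and upper limits as mean -/+ k sample standard deviations (assumed rule)."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    return mean - k * std, mean + k * std


def is_fault(value, thresholds):
    """Flag a reading that falls outside the normal-condition band."""
    low, high = thresholds
    return value < low or value > high


# Hypothetical vibration readings recorded while the printer runs normally.
baseline = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00]
limits = normal_thresholds(baseline)
```

With these baseline values the band is roughly 0.94 to 1.06, so a reading of 1.5 would be flagged while 1.01 would not.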

4.
Sensors (Basel) ; 22(21)2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36365909

ABSTRACT

The induction motor plays a vital role in industrial drive systems due to its robustness and easy maintenance, but it also suffers electrical faults, mainly rotor faults such as broken rotor bars. Early fault identification is needed to reduce maintenance expenses and avoid high costs, using failure detection frameworks that provide feature extraction and pattern classification to identify failures in an induction motor with classification models. In this paper, the open-source dataset of a rotor with broken bars in a three-phase induction motor, available on IEEE DataPort, is used for fault classification. The study aims at fault identification under various loading conditions on the rotor of an induction motor by performing time, frequency, and time-frequency domain feature extraction. The extracted features are provided to the models to classify healthy and faulty rotors. The features extracted from the time and frequency domains give an accuracy of up to 87.52% and 88.58%, respectively, using the Random Forest (RF) model. In the time-frequency domain, by contrast, Short Time Fourier Transform (STFT) based spectrograms provide reasonably high accuracy, around 97.67%, using a Convolutional Neural Network (CNN) based fine-tuned transfer learning framework for diagnosing induction motor rotor bar severity under various loading conditions.


Subjects
Algorithms , Vibration , Equipment Failure Analysis , Computer Simulation , Machine Learning
5.
Sensors (Basel) ; 22(20)2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36298415

ABSTRACT

Human ideas and sentiments are mirrored in facial expressions. They give the spectator a plethora of social cues, such as the viewer's focus of attention, intention, motivation, and mood, which can help develop better interactive solutions in online platforms. This could be helpful when teaching children and could help cultivate a better interactive connection between teachers and students, given the increasing trend toward online education platforms due to the COVID-19 pandemic. To address this, the authors propose emotion recognition for children based on visual cues, with reasoning justified through explainable AI. The authors used two datasets to work on this problem: the first is the LIRIS Children Spontaneous Facial Expression Video Database, and the second is a novel author-created dataset of emotions displayed by children aged 7 to 10. The authors found that earlier work on the LIRIS dataset had achieved only 75% accuracy and that no study had worked further on this dataset; here the authors achieved the highest accuracy of 89.31% on LIRIS and an accuracy of 90.98% on their own dataset. The authors also found that the facial structure of children and adults differs, and that children show emotions very differently and do not always use the same facial expression for a specific emotion as adults do. Hence, the authors used 468 3D landmark points and created two separate versions of the originally selected datasets, LIRIS-Mesh and Authors-Mesh. In total, four types of datasets were used, namely LIRIS, the authors' dataset, LIRIS-Mesh, and Authors-Mesh, and a comparative analysis was performed using seven different CNN models. The authors not only compared all dataset types on the different CNN models but also explained, for every CNN and every dataset type, how test images are perceived by the deep-learning models, using explainable artificial intelligence (XAI), which helps in localizing the features that contribute to particular emotions. The authors used three XAI methods, namely Grad-CAM, Grad-CAM++, and SoftGrad, which help users establish the reason for an emotion detection by showing the contribution of each feature to it.


Subjects
COVID-19 , Deep Learning , Adult , Child , Animals , Humans , Artificial Intelligence , Pandemics , Emotions
6.
Appl Soft Comput ; 123: 108973, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35572359

ABSTRACT

COVID-19 is a highly contagious disease that has infected over 136 million people worldwide with over 2.9 million deaths as of 11 April 2021. In March 2020, the WHO declared COVID-19 a pandemic and countries began to implement measures to control the spread of the virus. The spread and death rates of the virus differed dramatically among countries, showing that several factors affect its spread and mortality. Using the cumulative number of cases from Johns Hopkins University, the recovery rate, death rate, and the number of active, recovered, and death cases were simulated to analyse the trends and patterns within the chosen countries. Ten countries from 3 case severity categories (high cases, medium cases, and low cases) and 5 continents (Asia, North America, South America, Europe, and Oceania) were studied. A generalized SEIR model that considers control measures such as isolation and preventive measures such as vaccination is applied in this study. This model captures not only the dynamics between the states but also the time evolution of the states, using the fourth-order Runge-Kutta method. This study found no significant patterns among countries in the same case severity category, suggesting that other factors contribute to the pattern in these countries. One of the factors influencing the pattern in each country is the population's age. COVID-19 related deaths were found to be notably higher among older people, indicating that countries with a larger proportion of older age groups have an increased risk of experiencing higher death rates. Tighter governmental control measures led to fewer infections and eventually reduced the number of deaths while increasing the recovery rate, and early implementation was found to be far more effective in controlling the spread of the virus and produced better outcomes.
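As an illustration of integrating compartment dynamics with the fourth-order Runge-Kutta method, here is a minimal sketch of the classic four-compartment SEIR model. It is deliberately simpler than the generalized model in the abstract: it omits isolation and vaccination, and the rates `beta`, `sigma`, `gamma`, the step size, and the initial population are hypothetical values rather than fitted parameters.

```python
def seir_derivatives(state, beta, sigma, gamma):
    """Right-hand side of the classic SEIR equations."""
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n          # susceptible -> exposed
    return (-new_exposed,
            new_exposed - sigma * e,        # exposed -> infectious at rate sigma
            sigma * e - gamma * i,          # infectious -> recovered at rate gamma
            gamma * i)


def rk4_step(state, dt, beta, sigma, gamma):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * k for b, k in zip(base, slope))

    k1 = seir_derivatives(state, beta, sigma, gamma)
    k2 = seir_derivatives(shifted(state, k1, dt / 2), beta, sigma, gamma)
    k3 = seir_derivatives(shifted(state, k2, dt / 2), beta, sigma, gamma)
    k4 = seir_derivatives(shifted(state, k3, dt), beta, sigma, gamma)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(state, days, dt=0.1, beta=0.5, sigma=0.2, gamma=0.1):
    """Return the list of states from day 0 to `days` (assumed parameters)."""
    trajectory = [state]
    for _ in range(round(days / dt)):
        state = rk4_step(state, dt, beta, sigma, gamma)
        trajectory.append(state)
    return trajectory


# Hypothetical population of 10,000 with 10 initially infectious people.
trajectory = simulate((9990.0, 0.0, 10.0, 0.0), days=100)
```

Because the four derivatives sum to zero, the total population stays constant along the trajectory, which is a convenient sanity check on any implementation.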

7.
Sensors (Basel) ; 21(6)2021 Mar 10.
Article in English | MEDLINE | ID: mdl-33801865

ABSTRACT

Attitude estimation is the process of computing the orientation angles of an object with respect to a fixed frame of reference. Gyroscope, accelerometer, and magnetometer are some of the fundamental sensors used in attitude estimation. The orientation angles computed from these sensors are combined using the sensor fusion methodologies to obtain accurate estimates. The complementary filter is one of the widely adopted techniques whose performance is highly dependent on the appropriate selection of its gain parameters. This paper presents a novel cascaded architecture of the complementary filter that employs a nonlinear and linear version of the complementary filter within one framework. The nonlinear version is used to correct the gyroscope bias, while the linear version estimates the attitude angle. The significant advantage of the proposed architecture is its independence of the filter parameters, thereby avoiding tuning the filter's gain parameters. The proposed architecture does not require any mathematical modeling of the system and is computationally inexpensive. The proposed methodology is applied to the real-world datasets, and the estimation results were found to be promising compared to the other state-of-the-art algorithms.
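For context, the conventional linear complementary filter that this abstract builds on can be sketched as follows. This is the standard single-axis form with a hand-chosen gain `alpha`, not the cascaded architecture proposed in the paper (whose stated advantage is avoiding that gain tuning); the gain, sample period, and sensor values are assumptions.

```python
def complementary_filter(gyro_rates, accel_angles, dt, alpha=0.98, angle=0.0):
    """Fuse gyroscope rates (rad/s) with accelerometer-derived angles (rad)."""
    estimates = []
    for rate, acc_angle in zip(gyro_rates, accel_angles):
        # Integrated gyro rate dominates short-term changes (high-pass role),
        # while the accelerometer angle corrects long-term drift (low-pass role).
        angle = alpha * (angle + rate * dt) + (1.0 - alpha) * acc_angle
        estimates.append(angle)
    return estimates


# Hypothetical stationary sensor tilted at 0.1 rad with a 0.01 rad/s gyro bias.
dt = 0.01
gyro = [0.01] * 2000
accel = [0.1] * 2000
estimates = complementary_filter(gyro, accel, dt)
```

In this example the estimate settles near 0.105 rad instead of drifting by the 0.2 rad that pure integration of the biased gyroscope would accumulate over 20 s. The residual offset depends on `alpha`, which is the kind of parameter sensitivity the abstract refers to.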

8.
Sensors (Basel) ; 21(23)2021 Nov 25.
Article in English | MEDLINE | ID: mdl-34883839

ABSTRACT

A reasonably good network intrusion detection system generally requires a high detection rate and a low false alarm rate in order to predict anomalies more accurately. Older datasets cannot capture the schema of a set of modern attacks; therefore, modelling based on these datasets lacks sufficient generalizability. This paper operates on the UNSW-NB15 dataset, which is currently one of the best representatives of modern attacks, and suggests various models. We discuss these models and conclude with the model that performs best across various evaluation metrics. Alongside modelling, a comprehensive analysis of the dataset's features, based on correlation, variance, and similar factors, is carried out to give a wider picture and support better modelling. Furthermore, hypothetical considerations are discussed for potential network intrusion detection systems, including suggestions on prospective modelling and dataset generation.


Subjects
Computer Security , Data Analysis , Benchmarking , Prospective Studies , Records
9.
Sensors (Basel) ; 21(23)2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34884075

ABSTRACT

Distributed denial-of-service (DDoS) attacks are significant threats to the cyber world because of their potential to quickly bring down victims. Memcached vulnerabilities have been targeted by attackers using DDoS amplification attacks. GitHub and Arbor Networks were the victims of Memcached DDoS attacks with 1.3 Tbps and 1.8 Tbps attack strengths, respectively. The bandwidth amplification factor of nearly 50,000 makes Memcached the deadliest DDoS attack vector to date. In recent times, fellow researchers have made specific efforts to analyze and evaluate Memcached vulnerabilities; however, the solutions provided for security are based on best practices by users and service providers. This study is the first attempt at modifying the architecture of Memcached servers in the context of improving security against DDoS attacks. This study discusses the Memcached protocol, the vulnerabilities associated with it, the future challenges for different IoT applications associated with caches, and the solutions for detecting Memcached DDoS attacks. The proposed solution is a novel identification-pattern mechanism using a threshold scheme for detecting volume-based DDoS attacks. In the undertaken study, the solution acts as a pre-emptive measure for detecting DDoS attacks while maintaining low latency and high throughput.
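The abstract does not describe the identification-pattern mechanism in detail. As a generic illustration of what a threshold scheme for volume-based detection can look like, the sketch below flags traffic when the bytes seen in a sliding time window exceed a limit. The class name, window length, and byte limit are hypothetical and are not taken from the paper.

```python
from collections import deque


class VolumeThresholdDetector:
    """Flag traffic whose byte volume in a sliding window exceeds a threshold."""

    def __init__(self, window_seconds=1.0, max_bytes=1_000_000):
        self.window_seconds = window_seconds
        self.max_bytes = max_bytes
        self._events = deque()   # (timestamp, size) pairs inside the window
        self._total = 0          # running byte count for the window

    def observe(self, timestamp, size):
        """Record one response and return True if the window is over the limit."""
        self._events.append((timestamp, size))
        self._total += size
        # Evict events that have fallen out of the sliding window.
        while self._events and self._events[0][0] <= timestamp - self.window_seconds:
            _, old_size = self._events.popleft()
            self._total -= old_size
        return self._total > self.max_bytes


# Hypothetical usage with a small limit so the behaviour is easy to follow.
detector = VolumeThresholdDetector(window_seconds=1.0, max_bytes=1000)
flags = [detector.observe(t, size)
         for t, size in [(0.0, 400), (0.5, 400), (0.9, 400), (2.0, 100)]]
```

A real deployment would likely track volume per source or per key and combine it with request-pattern checks; the single global counter here only shows the thresholding idea.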


Subjects
Computer Security , Forecasting
10.
J Digit Imaging ; 34(4): 932-947, 2021 08.
Article in English | MEDLINE | ID: mdl-34240273

ABSTRACT

Retinopathy of prematurity (ROP) is a potentially blinding disorder seen in low birth weight preterm infants. In India, the burden of ROP is high, with nearly 200,000 premature infants at risk. Early detection through screening and treatment can prevent this blindness. The automatic screening systems developed so far can detect "severe ROP" or "plus disease," but this information does not help schedule follow-up. Identifying vascularized retinal zones and detecting the ROP stage is essential for follow-up or discharge from screening. There is no automatic system to assist these crucial decisions to the best of the authors' knowledge. The low contrast of images, incompletely developed vessels, macular structure, and lack of public data sets are a few challenges in creating such a system. In this paper, a novel method using an ensemble of "U-Network" and "Circle Hough Transform" is developed to detect zones I, II, and III from retinal images in which macula is not developed. The model developed is generic and trained on mixed images of different sizes. It detects zones in images of variable sizes captured by two different imaging systems with an accuracy of 98%. All images of the test set (including the low-quality images) are considered. The time taken for training was only 14 min, and a single image was tested in 30 ms. The present study can help medical experts interpret retinal vascular status correctly and reduce subjective variation in diagnosis.


Subjects
Deep Learning , Retinopathy of Prematurity , Humans , Infant, Low Birth Weight , Infant, Newborn , Infant, Premature , Retina/diagnostic imaging , Retinopathy of Prematurity/diagnostic imaging
11.
Sensors (Basel) ; 20(18)2020 Sep 22.
Article in English | MEDLINE | ID: mdl-32972037

ABSTRACT

Air pollution has been a looming issue of the 21st century that has significantly impacted the surrounding environment and societal health. Previous studies have conducted extensive research on air pollution and air quality monitoring. Despite this, the fields of air pollution and air quality monitoring remain plagued with unsolved problems. In this study, the Pollution Weather Prediction System (PWP) is proposed to perform air pollution prediction at outdoor sites for various pollution parameters. In the presented research work, we introduced a PWP system configured with pollution-sensing units, such as SDS021, MQ07-CO, NO2-B43F, and Aeroqual Ozone (O3). These sensing units were used to collect and measure various pollutant levels, such as PM2.5, PM10, CO, NO2, and O3, for 90 days at Symbiosis International University, Pune, Maharashtra, India. The data collection was carried out between December 2019 and February 2020, during the winter. The investigation results validate the success of the presented PWP system. In the conducted experiments, linear regression and artificial neural network (ANN)-based AQI (air quality index) predictions were performed. Furthermore, the presented study found that the customized linear regression methodology outperformed other machine-learning methods, such as linear, ridge, Lasso, Bayes, Huber, Lars, Lasso-lars, stochastic gradient descent (SGD), and ElasticNet regression methodologies, and the customized ANN regression methodology used in the conducted experiments. The overall AQI values of the air pollutants were calculated as the summation of the AQI values of all the presented air pollutants. Finally, web and mobile interfaces were developed to display air pollution prediction values for a variety of air pollutants.


Subjects
Air Pollution , Environmental Monitoring , Weather , Air Pollution/analysis , Bayes Theorem , Humans , India , Particulate Matter/analysis
12.
MethodsX ; 12: 102654, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38510932

ABSTRACT

Handwritten text recognition (HTR) within computer vision and image processing stands as a prominent and challenging research domain, holding significant implications for diverse applications. Among these, it is useful for reading bank checks and prescriptions and for deciphering characters on various forms. Optical character recognition (OCR) technology, specifically tailored for handwritten documents, plays a pivotal role in translating characters from a range of file formats, encompassing both word and image documents. Challenges in HTR include intricate layout designs, varied handwriting styles, limited datasets, and low achieved accuracy. Recent advancements in Deep Learning and Machine Learning algorithms, coupled with vast repositories of unprocessed data, have propelled researchers to achieve remarkable progress in HTR. This paper aims to address the challenges in handwritten text recognition by proposing a hybrid approach. The primary objective is to enhance the accuracy of recognizing handwritten text from images. Through the integration of Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) with a Connectionist Temporal Classification (CTC) decoder, the results indicate substantial improvement. The proposed hybrid model achieved 98.50% and 98.80% accuracy on the IAM and RIMES datasets, respectively. This underscores the potential and efficacy of the consecutive use of these advanced neural network architectures in enhancing handwritten text recognition accuracy.
•The proposed method introduces a hybrid approach for handwritten text recognition, employing CNN and BiLSTM with a CTC decoder.
•Results show accuracies of 98.50% and 98.80% on the IAM and RIMES datasets, emphasizing the potential of this model for enhanced accuracy in recognizing handwritten text from images.
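The abstract names a CTC decoder without saying which decoding strategy is used. The simplest one, greedy (best-path) decoding, is sketched below as an illustration: take the most likely symbol per frame, collapse consecutive repeats, then drop the blank symbol. The function name, the three-symbol alphabet, and the per-frame scores are hypothetical.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    decoded = []
    previous = None
    for index in best:
        # Emit a symbol only when it differs from the previous frame's symbol
        # and is not the blank; a blank between repeats keeps both letters.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Hypothetical per-frame scores over the alphabet: blank "-", "a", "b".
alphabet = ["-", "a", "b"]
frame_scores = [
    [0.10, 0.80, 0.10],   # a
    [0.20, 0.70, 0.10],   # a (repeat, merged)
    [0.90, 0.05, 0.05],   # blank separates the two "a" characters
    [0.10, 0.60, 0.30],   # a
    [0.10, 0.20, 0.70],   # b
    [0.20, 0.10, 0.70],   # b (repeat, merged)
]
text = ctc_greedy_decode(frame_scores, alphabet)
```

Here the frame-level path `a a - a b b` decodes to `aab`. Beam search is a common alternative when higher accuracy is needed at the cost of more computation.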

13.
MethodsX ; 12: 102554, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38292314

ABSTRACT

Digitization has created a demand for highly efficient handwritten document recognition systems. A handwritten document consists of digits, text, symbols, diagrams, etc. Digits are an essential element of handwritten documents, and accurate recognition of handwritten digits is vital for effective communication and data analysis. Various researchers have attempted to address this issue with modern convolutional neural network (CNN) techniques. Despite their high identification accuracy, CNN filter weights remain fixed after training, so the model cannot flexibly adapt to input changes. Hence, computer vision researchers have recently become interested in Vision Transformers (ViTs) and Multilayer Perceptrons (MLPs). The shortcomings of CNNs gave rise to a hybrid model revolution that combines the best elements of the two fields. This paper analyzes how the hybrid convolutional ViT model affects the ability to recognize handwritten digits. In addition, real-time data contains noise, distortions, and varying writing styles. Hence, cleaned and uncleaned handwritten digit images are used for evaluation in this paper. The accuracy of the proposed method is compared with state-of-the-art techniques, and the results show that the proposed model achieves the highest recognition accuracy. Probable solutions for recognizing other aspects of handwritten documents are also discussed in this paper.
•Analyzed the effect of the convolutional vision transformer on cleaned and real-time handwritten digit images.
•The model's performance improved with the application of cross-validation and hyper-parameter tuning.
•The results show that the proposed model is robust, feasible, and effective on cleaned and uncleaned handwritten digits.

14.
MethodsX ; 12: 102747, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774685

ABSTRACT

The Internet of Things (IoT) has radically reformed various sectors and industries, enabling unprecedented levels of connectivity and automation. However, the surge in the number of IoT devices has also widened the attack surface, rendering IoT networks potentially susceptible to a plethora of security risks. Addressing the challenge of enhancing security in IoT networks is therefore of utmost importance. Moreover, there is a considerable lack of datasets designed exclusively for IoT applications. To bridge this gap, a customized dataset that accurately mimics real-world IoT scenarios impacted by four different types of attacks (blackhole, sinkhole, flooding, and version number attacks) was generated in this study using the Contiki-OS Cooja Simulator. The resulting dataset is then employed to evaluate the efficacy of several metaheuristic algorithms in conjunction with a Convolutional Neural Network (CNN) for IoT networks.
•The proposed study's goal is to identify optimal hyperparameters for CNNs, ensuring their peak performance in intrusion detection tasks.
•This study not only deepens our understanding of IoT network security but also provides practical guidance for implementing robust security measures in real-world IoT applications.

15.
MethodsX ; 12: 102737, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774687

ABSTRACT

In the digital age, the proliferation of health-related information online has heightened the risk of misinformation, posing substantial threats to public well-being. This research conducts a meticulous comparative analysis of classification models, focusing on detecting health misinformation. The study evaluates the performance of traditional machine learning models and advanced graph convolutional networks (GCN) across critical algorithmic metrics. The results give a comprehensive understanding of each algorithm's effectiveness in identifying health misinformation and provide valuable insights for combating the pervasive spread of false health information in the digital landscape. GCN with TF-IDF gives the best result, as shown in the result section.
•The research method involves a comparative analysis of classification algorithms to detect health misinformation, exploring traditional machine learning models and graph convolutional networks.
•The algorithms employed were Passive Aggressive Classifier, Random Forest, Decision Tree, Logistic Regression, LightGBM, GCN, GCN with BERT, GCN with TF-IDF, and GCN with Word2Vec. Accuracy: Passive Aggressive Classifier: 85.75 %, Random Forest: 86 %, Decision Tree: 81.30 %, LightGBM: 83.29 %, normal GCN: 84.53 %, GCN with BERT: 85.00 %, GCN with TF-IDF: 93.86 %, and GCN with Word2Vec: 81.00 %.
•Algorithmic performance metrics, including accuracy, precision, recall, and F1-score, were systematically evaluated to assess the efficacy of each model in detecting health misinformation, focusing on understanding the strengths and limitations of different approaches. Graph Convolutional Networks (GCNs) with TF-IDF embedding performed best, achieving an accuracy of 93.86 %.
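Since TF-IDF features give the best-performing GCN variant in this abstract, a minimal sketch of how such features are computed may help. It uses the textbook definition (term frequency times the logarithm of inverse document frequency); library implementations often use smoothed variants, and the abstract does not state which form the authors used. The function name and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy corpus (assumed data) standing in for health-related posts.
docs = ["garlic cures flu", "vaccine prevents flu", "vaccine trial results"]
vectors = tf_idf(docs)
```

Terms that appear in fewer documents, such as "garlic", receive a higher weight than terms shared across documents, such as "flu"; a term present in every document gets a weight of zero under this definition.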

16.
PeerJ Comput Sci ; 10: e1769, 2024.
Article in English | MEDLINE | ID: mdl-38686011

ABSTRACT

Object detection methods based on deep learning have been used in a variety of sectors including banking, healthcare, e-governance, and academia. In recent years, much attention has been paid to research on text detection and recognition from different scenes or images in unstructured document processing. The article's novelty lies in the detailed discussion and implementation of various transfer learning-based backbone architectures for printed text recognition. In this research article, the authors compared the ResNet50, ResNet50V2, ResNet152V2, Inception, Xception, and VGG19 backbone architectures, with preprocessing techniques such as data resizing, normalization, and noise removal, on a standard OCR Kaggle dataset. Further, the top three backbone architectures were selected based on the accuracy achieved, and hyperparameter tuning was then performed to achieve more accurate results. Xception performed well compared with the ResNet, Inception, VGG19, and MobileNet architectures, achieving high evaluation scores with accuracy (98.90%) and min loss (0.19). According to existing research in this domain, transfer learning-based backbone architectures used for printed or handwritten data recognition are not yet well represented in the literature. We split the dataset into 80 percent for training and 20 percent for testing, fed it to the different backbone architecture models for the same number of epochs, and found that the Xception architecture achieved higher accuracy than the others. In addition, the ResNet50V2 model gave higher accuracy (96.92%) than the ResNet152V2 model (96.34%).

17.
MethodsX ; 12: 102555, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38292312

ABSTRACT

A rolling bearing is a crucial element within rotating machinery, and its smooth operation profoundly influences the overall well-being of the equipment. Consequently, analyzing its operational condition is crucial to prevent production losses or, in extreme cases, potential fatalities due to catastrophic failures. Accurate estimates of the Remaining Useful Life (RUL) of rolling bearings ensure manufacturing safety while also leading to cost savings.
•This paper proposes an intelligent deep learning-based framework for remaining useful life estimation of bearings on the basis of informed detection of anomalies.
•The paper demonstrates the setup of an experimental bearing test rig and the collection of bearing condition monitoring data such as vibration data.
•Advanced hybrid models of Encoder-Decoder LSTM demonstrate high forecasting accuracy in RUL estimation.

18.
Interdiscip Sci ; 16(1): 16-38, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37962777

ABSTRACT

As one of the most common female cancers, cervical cancer often develops years after a prolonged and reversible pre-cancerous stage. Traditional classification algorithms used for detection of cervical cancer often require cell segmentation and feature extraction techniques, while convolutional neural network (CNN) models demand a large dataset to mitigate over-fitting and poor generalization problems. To this end, this study aims to develop deep learning models for automated cervical cancer detection that do not rely on segmentation methods or custom features. Due to limited data availability, transfer learning was employed with pre-trained CNN models to operate directly on Pap smear images for a seven-class classification task. Thorough evaluation and comparison of 13 pre-trained deep CNN models were performed using the publicly available Herlev dataset and the Keras package in Google Colaboratory. In terms of accuracy and performance, DenseNet-201 is the best-performing model. The pre-trained CNN models studied in this paper produced good experimental results and required little computing time.


Subjects
Papanicolaou Test , Uterine Cervical Neoplasms , Female , Humans , Papanicolaou Test/methods , Uterine Cervical Neoplasms/diagnostic imaging , Neural Networks, Computer , Algorithms , Image Interpretation, Computer-Assisted/methods
19.
Data Brief ; 52: 110033, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38299103

ABSTRACT

This article presents a multimodal database consisting of 222 images of 76 people, of which 111 are OCTA images and 111 are color fundus images, taken at the Natasha Eye Care and Research Institute in Pune, Maharashtra, India. Nonmydriatic fundus images were acquired using a confocal SLO widefield fundus imaging Eidon machine. Nonmydriatic OCTA images were acquired using the Optovue Avanti Edition machine. Initially, the clinical approach described in this article was used to obtain the retinal images. Following that, the dataset was categorized by two experienced eye specialists. Medical professionals and scholars can use this data to identify instances of Non-Proliferative Diabetic Retinopathy (NPDR) and its various stages. Research scholars and ophthalmologists can use the data to develop the initial stages of automated identification techniques for diabetic retinopathy (DR).

20.
Heliyon ; 10(4): e26162, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38420442

ABSTRACT

In recent decades, abstractive text summarization using multimodal input has attracted many researchers because it can gather information from various sources to create a concise summary. However, existing multimodal summarization methods summarize only short videos and give poor results for lengthy videos. To address these issues, this research presents Multimodal Abstractive Summarization using Bidirectional Encoder Representations from Transformers (MAS-BERT) with an attention mechanism. The purpose of video summarization is to speed up searching a large collection of videos so that users can quickly decide whether a video is relevant by reading its summary. Initially, the data is obtained from the publicly available How2 dataset and is encoded using the Bidirectional Gated Recurrent Unit (Bi-GRU) encoder and the Long Short Term Memory (LSTM) encoder. The textual data, embedded in the embedding layer, is encoded using a bidirectional GRU encoder, and the audio and video features are encoded with an LSTM encoder. After this, a BERT-based attention mechanism is used to combine the modalities, and finally a Bi-GRU based decoder is used to summarize the combined modalities. The experimental results show that the proposed MAS-BERT achieves a better Rouge-1 score of 60.2, whereas the existing Decoder-only Multimodal Transformer (D-MmT) and the Factorized Multimodal Transformer based Decoder Only Language model (FLORAL) achieve 49.58 and 56.89, respectively. Our work provides users with better contextual information and user experience, and could help video-sharing platforms retain customers by allowing users to search for relevant videos by looking at their summaries.
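For readers unfamiliar with the Rouge-1 metric reported in this abstract, the sketch below computes unigram-overlap precision, recall, and F1 between a candidate summary and a reference. It is a simplified version with whitespace tokenization and no stemming; the abstract does not say which Rouge-1 variant (recall or F1) its scores refer to, and the function name and example sentences are hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Unigram-overlap precision, recall, and F1 (simplified Rouge-1)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared word at most as often
    # as it occurs in both texts (clipped counts).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical candidate and reference summaries.
precision, recall, f1 = rouge_1("the cat sat on the mat",
                                "the cat lay on the mat")
```

Five of the six words match in this example, so precision, recall, and F1 are all 5/6; published scores such as those above are usually these fractions multiplied by 100.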
