1.
MethodsX ; 13: 102780, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39007030

ABSTRACT

In today's multimedia landscape, the sheer volume of CCTV footage poses challenges for storage, accessibility, and efficient navigation. To tackle these issues, we propose a comprehensive video summarization technique that merges machine-learning methods with user engagement. Our methodology consists of two phases, each improving a different aspect of video summarization. In Phase I, we introduce a summarization method based on keyframe detection and behavioral analysis, using YOLOv5 for object recognition, Deep SORT for object tracking, and a Single Shot Detector (SSD) to create video summaries. In Phase II, we present a user-interest-based video summarization system driven by machine learning; by incorporating user preferences into the summarization process, we extend these techniques with personalized content curation. Leveraging tools such as NLTK, OpenCV, TensorFlow, and the EfficientDet model enables the system to generate customized video summaries tailored to individual preferences. This approach not only improves user interaction but also helps manage the overwhelming volume of video data on digital platforms. By combining the two methodologies, we advance the application of machine learning while offering a practical solution to the challenges of managing multimedia data.
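No code accompanies this abstract; the snippet below is only a minimal sketch of the Phase I idea (selecting candidate keyframes by frame differencing, with a pluggable object detector), not the authors' implementation. The `diff_threshold` value and the commented YOLOv5 hook are illustrative assumptions.

```python
import cv2

def extract_keyframes(video_path, diff_threshold=30.0):
    """Pick candidate keyframes where consecutive frames differ strongly.

    diff_threshold is an illustrative value, not taken from the paper.
    """
    cap = cv2.VideoCapture(video_path)
    keyframes = []
    prev_gray = None
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None:
            # Mean absolute difference between consecutive grayscale frames
            score = cv2.absdiff(gray, prev_gray).mean()
            if score > diff_threshold:
                keyframes.append((frame_idx, frame))
        prev_gray = gray
        frame_idx += 1
    cap.release()
    return keyframes

# A YOLOv5 detector (as in Phase I) could then keep only keyframes that
# contain objects of interest, e.g.:
#   model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
#   detections = model(frame)
```

Deep SORT tracking and the user-interest ranking of Phase II would sit on top of detections like these; they are omitted here for brevity.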

2.
MethodsX ; 12: 102737, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774687

ABSTRACT

In the digital age, the proliferation of health-related information online has heightened the risk of misinformation, posing substantial threats to public well-being. This research conducts a meticulous comparative analysis of classification models for detecting health misinformation. The study evaluates the performance of traditional machine learning models and graph convolutional networks (GCNs) across critical algorithmic metrics. The results provide a comprehensive understanding of each algorithm's effectiveness in identifying health misinformation and offer valuable insights for combating the pervasive spread of false health information online. As shown in the results section, GCN with TF-IDF features gives the best result.
•The research method is a comparative analysis of classification algorithms for detecting health misinformation, covering traditional machine learning models and graph convolutional networks; a sketch of the baseline comparison follows this list.
•The algorithms employed were Passive Aggressive Classifier, Random Forest, Decision Tree, Logistic Regression, LightGBM, GCN, GCN with BERT, GCN with TF-IDF, and GCN with Word2Vec. Accuracy: Passive Aggressive Classifier 85.75 %, Random Forest 86 %, Decision Tree 81.30 %, LightGBM 83.29 %, plain GCN 84.53 %, GCN with BERT 85.00 %, GCN with TF-IDF 93.86 %, and GCN with Word2Vec 81.00 %.
•Algorithmic performance metrics, including accuracy, precision, recall, and F1-score, were systematically evaluated to assess each model's efficacy in detecting health misinformation, focusing on the strengths and limitations of the different approaches. GCN with TF-IDF embedding performed best, achieving an accuracy of 93.86 %.
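The abstract does not include code; the following is a hedged sketch of how the traditional baselines could be compared on TF-IDF features with scikit-learn. The dataset split, vectorizer settings, and default hyperparameters are assumptions, and the GCN variants (which require building a document-word graph) are omitted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier, LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compare_baselines(texts, labels):
    """Evaluate the traditional classifiers on TF-IDF features of the claims."""
    # TF-IDF turns each text into a sparse term-weight vector
    X = TfidfVectorizer(max_features=20000).fit_transform(texts)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.2, random_state=42)

    models = {
        "PassiveAggressive": PassiveAggressiveClassifier(),
        "RandomForest": RandomForestClassifier(),
        "DecisionTree": DecisionTreeClassifier(),
        "LogisticRegression": LogisticRegression(max_iter=1000),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        preds = model.predict(X_te)
        acc = accuracy_score(y_te, preds)
        p, r, f1, _ = precision_recall_fscore_support(y_te, preds, average="macro")
        print(f"{name}: acc={acc:.3f} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

LightGBM and the GCN variants reported in the abstract would be added as further entries in the same loop, with the GCN models trained on a graph built from the TF-IDF, BERT, or Word2Vec representations.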

3.
Heliyon ; 10(4): e26162, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38420442

ABSTRACT

In recent decades, abstractive text summarization from multimodal input has attracted many researchers because it can gather information from various sources to create a concise summary. However, existing multimodal summarization methods handle only short videos and give poor results on lengthy ones. To address these issues, this research presents Multimodal Abstractive Summarization using Bidirectional Encoder Representations from Transformers (MAS-BERT) with an attention mechanism. The purpose of video summarization is to speed up search over large video collections so that users can quickly decide whether a video is relevant by reading its summary. The data is obtained from the publicly available How2 dataset and encoded with a Bidirectional Gated Recurrent Unit (Bi-GRU) encoder and a Long Short Term Memory (LSTM) encoder: the textual data, after the embedding layer, is encoded by the bidirectional GRU encoder, while the audio and video features are encoded by the LSTM encoder. A BERT-based attention mechanism then combines the modalities, and a Bi-GRU decoder generates the multimodal summary. Experimental results show that the proposed MAS-BERT achieves a ROUGE-1 score of 60.2, whereas the existing Decoder-only Multimodal Transformer (D-MmT) and the Factorized Multimodal Transformer based Decoder-Only Language model (FLORAL) achieve 49.58 and 56.89, respectively. Our work gives users better contextual information and a better experience, and can help video-sharing platforms retain users by letting them find relevant videos through their summaries.
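To make the encoder-fusion-decoder layout concrete, here is a minimal PyTorch skeleton of an architecture in the spirit of the abstract. All dimensions and layer choices are assumptions, and the paper's BERT-based attention is approximated here by a standard multi-head cross-attention layer; this is not the authors' MAS-BERT implementation.

```python
import torch
import torch.nn as nn

class MultimodalFusionSummarizer(nn.Module):
    """Illustrative sketch: Bi-GRU text encoder, LSTM audio/video encoder,
    cross-modal attention fusion, and a GRU decoder over summary tokens."""

    def __init__(self, vocab_size, emb_dim=256, hid=256, av_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bi-GRU encoder for the transcript tokens
        self.text_enc = nn.GRU(emb_dim, hid, batch_first=True, bidirectional=True)
        # LSTM encoder for precomputed audio/video feature sequences
        self.av_enc = nn.LSTM(av_dim, hid, batch_first=True)
        self.text_proj = nn.Linear(2 * hid, hid)
        # Cross-modal attention: text states attend over audio/video states
        # (stand-in for the paper's BERT-based attention)
        self.fuse = nn.MultiheadAttention(embed_dim=hid, num_heads=4, batch_first=True)
        # GRU decoder that emits summary token logits
        self.dec = nn.GRU(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, tokens, av_feats):
        t, _ = self.text_enc(self.embed(tokens))        # (B, Lt, 2*hid)
        t = self.text_proj(t)                           # (B, Lt, hid)
        a, _ = self.av_enc(av_feats)                    # (B, La, hid)
        fused, _ = self.fuse(query=t, key=a, value=a)   # (B, Lt, hid)
        dec_out, _ = self.dec(fused)                    # (B, Lt, hid)
        return self.out(dec_out)                        # (B, Lt, vocab)
```

A full system would add teacher forcing in the decoder and ROUGE-based evaluation on How2-style (transcript, audio/video feature, summary) triples.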
