Results 1 - 12 of 12
1.
Behav Res Methods ; 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38082115

ABSTRACT

Driving requires vision, yet there is little empirical data about how vision and cognition support safe driving. It is difficult to study perception during natural driving because the experimental rigor required would be dangerous and unethical to implement on the road, and the driving environment is complex, dynamic, and immensely variable, making it extremely challenging to replicate accurately in simulation. Our proposed solution is to study vision using stimuli that reflect this inherent complexity: footage of real driving situations. To this end, we curated a set of 750 crowd-sourced video clips (434 hazard and 316 no-hazard clips), which have been spatially, temporally, and categorically annotated. These annotations describe where the hazard appears, what it is, and when it occurs. In addition, perceived dangerousness changes from moment to moment and is not a simple binary detection judgement. To capture this more granular aspect of our stimuli, we asked 48 observers to rate the perceived hazardousness of 1356 brief video clips taken from these 750 source clips on a continuous scale. The ratings span the entire scale, have high interrater agreement, and are robust to driving history. This novel stimulus set is not only useful for understanding drivers' ability to detect hazards but is also a tool for studying dynamic scene perception and other aspects of visual function. While the stimulus set was originally designed for behavioral studies, researchers in other areas, such as traffic safety or computer vision, may also find it a useful resource.
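A minimal sketch of how clip-level annotations of this kind might be loaded for analysis is shown below; the CSV file name and column names are hypothetical placeholders, not the dataset's published format.

    # Load hypothetical per-clip annotations: hazard flag, onset frame,
    # hazard category, and the mean continuous rating described above.
    import csv

    def load_annotations(path="hazard_clips.csv"):  # placeholder file name
        clips = []
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                clips.append({
                    "clip_id": row["clip_id"],
                    "is_hazard": row["is_hazard"] == "1",
                    "onset_frame": int(row["onset_frame"]) if row["onset_frame"] else None,
                    "category": row["category"],               # what the hazard is
                    "mean_rating": float(row["mean_rating"]),  # continuous rating
                })
        return clips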

2.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(4): 812-819, 2023 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-37666774

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication and repetitive behaviors. With the rapid development of computer vision, visual behavior analysis to aid the diagnosis of ASD has received increasing attention. This paper reviews the research on visual behavior analysis for aiding ASD diagnosis. First, the core symptoms and clinical diagnostic criteria of ASD are briefly introduced. Second, the interaction scenes are classified and described according to the clinical diagnostic criteria. Then, the existing relevant datasets are discussed. Finally, we analyze and compare the advantages and disadvantages of visual behavior analysis methods for aiding ASD diagnosis in different interactive scenarios. The challenges in this research field are summarized, and the prospects of related research are presented to promote the clinical application of visual behavior analysis in ASD diagnosis.


Subject(s)
Autism Spectrum Disorder; Behavior; Vision, Ocular; Humans; Autism Spectrum Disorder/diagnosis
3.
Surg Endosc ; 36(2): 1143-1151, 2022 02.
Article in English | MEDLINE | ID: mdl-33825016

ABSTRACT

BACKGROUND: Dividing a surgical procedure into a sequence of identifiable and meaningful steps facilitates intraoperative video data acquisition and storage. These efforts are especially valuable for technically challenging procedures that require intraoperative video analysis, such as transanal total mesorectal excision (TaTME); however, manual video indexing is time-consuming. Thus, in this study, we constructed an annotated video dataset for TaTME with surgical step information and evaluated the performance of a deep learning model in recognizing the surgical steps in TaTME. METHODS: This was a single-institution retrospective feasibility study. All TaTME intraoperative videos were divided into frames. Each frame was manually annotated as one of the following major steps: (1) purse-string closure; (2) full-thickness transection of the rectal wall; (3) down-to-up dissection; (4) dissection after rendezvous; and (5) purse-string suture for stapled anastomosis. Steps 3 and 4 were each further classified into four sub-steps, specifically for dissection of the anterior, posterior, right, and left planes. A convolutional neural network-based deep learning model, Xception, was utilized for the surgical step classification task. RESULTS: Our dataset of 50 TaTME videos was randomly divided into training and test subsets of 40 and 10 videos, respectively. The overall accuracy obtained for the five major steps was 93.2%. By contrast, when sub-step classification was included in the performance analysis, a mean accuracy (± standard deviation) of 78% (± 5%), with a maximum accuracy of 85%, was obtained. CONCLUSIONS: To the best of our knowledge, this is the first study on automatic surgical step classification for TaTME. After training, our deep learning model recognized the surgical steps in TaTME videos with high accuracy. Thus, our model can be applied to a system for intraoperative guidance or for postoperative video indexing and analysis in TaTME procedures.
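As a rough illustration of the frame-level setup described above, the sketch below builds a five-class Xception classifier with Keras; the input size, optimizer, and transfer-learning choices are assumptions, not the authors' implementation.

    import tensorflow as tf

    NUM_STEPS = 5  # the five major TaTME steps listed in the abstract

    # ImageNet-pretrained Xception backbone with a new classification head.
    base = tf.keras.applications.Xception(
        weights="imagenet", include_top=False, input_shape=(299, 299, 3))
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    out = tf.keras.layers.Dense(NUM_STEPS, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_frames, train_labels, validation_data=(val_frames, val_labels))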


Subject(s)
Deep Learning; Laparoscopy; Proctectomy; Rectal Neoplasms; Transanal Endoscopic Surgery; Humans; Laparoscopy/methods; Postoperative Complications/surgery; Proctectomy/education; Rectal Neoplasms/surgery; Rectum/surgery; Retrospective Studies; Transanal Endoscopic Surgery/methods
4.
Displays ; 73: 102235, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35574253

ABSTRACT

The COVID-19 outbreak has heightened the need for an AI-based monitoring system that can check face mask adherence and social distancing. Building on existing video surveillance systems, a deep learning model is proposed for mask detection and social distance measurement. State-of-the-art object detection and recognition models such as Mask RCNN, YOLOv4, YOLOv5, and YOLOR were trained for mask detection and evaluated on existing datasets and on a newly proposed video mask detection dataset, ViDMASK. The best result, a mean average precision of 92.4%, was obtained with YOLOR. After mask detection, the distance between people's faces is measured and classified as high-risk or low-risk. Furthermore, the new large-scale video mask dataset, ViDMASK, diversifies the subjects in terms of pose, environment, image quality, and subject characteristics, producing a challenging dataset. The tested models detect face masks with high performance on the existing dataset, MOXA. With the ViDMASK dataset, however, most models are less accurate because of the complexity of the dataset and the number of people in each scene. The ViDMASK dataset and base code are available at https://github.com/ViDMask/VidMask-code.git.
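To illustrate the post-detection distancing step, here is a minimal sketch that flags pairs of detected faces closer than a pixel threshold; the threshold value and box format are placeholder assumptions, and the paper's exact risk criterion may differ.

    from itertools import combinations
    import math

    def centers(boxes):
        # boxes: (x1, y1, x2, y2) face boxes from a detector such as YOLOv5
        return [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]

    def risky_pairs(boxes, min_px=150):  # placeholder pixel threshold
        """Return index pairs whose face centers fall below the threshold."""
        pts = centers(boxes)
        return [(i, j) for i, j in combinations(range(len(pts)), 2)
                if math.dist(pts[i], pts[j]) < min_px]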

5.
Sensors (Basel) ; 20(8)2020 Apr 22.
Article in English | MEDLINE | ID: mdl-32331463

ABSTRACT

Statistics on different pig behaviors can reflect the animals' health status. Traditionally, however, such statistics were obtained by watching videos and recording the behaviors by eye. To reduce labor and time consumption, this paper proposes a pig behavior recognition network with a spatiotemporal convolutional network, based on the SlowFast network architecture, for five-category behavior classification. First, a pig behavior recognition video dataset (PBVD-5) was built by cutting short clips from three months of non-stop video footage, covering five categories of pig behavior: feeding, lying, motoring, scratching, and mounting. Subsequently, a SlowFast-based spatiotemporal convolutional network for pig multi-behavior recognition (PMB-SCN) was proposed. Variant architectures of the PMB-SCN were implemented, and the optimal architecture was compared with a state-of-the-art single-stream 3D convolutional network on our dataset. Our 3D pig behavior recognition network achieved a top-1 accuracy of 97.63% and a views accuracy of 96.35% on the PBVD test set, and a top-1 accuracy of 91.87% and a views accuracy of 84.47% on a new test set collected from a completely different pigsty. The experimental results showed that this network generalizes remarkably well and opens the possibility of subsequent simultaneous pig detection and behavior recognition.
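For readers who want to experiment with a comparable setup, the sketch below adapts the public SlowFast model from PyTorchVideo's hub to five behavior classes; it is a hedged stand-in, not the paper's PMB-SCN architecture or training code.

    import torch
    import torch.nn as nn

    # Pretrained SlowFast backbone from the public pytorchvideo hub.
    model = torch.hub.load("facebookresearch/pytorchvideo",
                           "slowfast_r50", pretrained=True)

    # Replace the final projection with a 5-way head (the five behaviors).
    head = model.blocks[-1].proj
    model.blocks[-1].proj = nn.Linear(head.in_features, 5)

    # SlowFast takes two clips: a slow pathway (e.g. 8 frames) and a fast
    # pathway sampled more densely (e.g. 32 frames) from the same window.
    slow = torch.randn(1, 3, 8, 256, 256)
    fast = torch.randn(1, 3, 32, 256, 256)
    logits = model([slow, fast])  # shape: (1, 5)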


Subject(s)
Neural Networks, Computer; Algorithms; Animals; Deep Learning; Machine Learning; Pattern Recognition, Automated; Swine
6.
Data Brief ; 53: 110189, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38389956

ABSTRACT

In the convergence of cultural heritage preservation and computational research, this paper presents a comprehensive dataset capturing the nuanced artistry of traditional Balinese dancing. This endeavour not only documents but also aims to protect these intricate dance movements, bridging technological advancements with rich cultural traditions. The dataset includes 1740 and 828 high-definition video recordings of female and male dancers, respectively, showcasing 24 unique dance movements. These recordings, made using multiple smartphone cameras from various perspectives, reflect the dance's complex nature and emphasize gender-based movement variations, which are key to detailed cultural analysis. Our systematic approach involved three strategically placed cameras capturing diverse angles for a holistic view. The subsequent preprocessing stage, including segmenting and labelling, has enhanced the dataset's clarity, making it a valuable resource for cultural studies and computational analysis in preserving intangible cultural heritage. The videos, stored in MP4 format, are categorized by dancer gender, dance type, and camera angle, offering researchers a rich, multifaceted tool for exploring this traditional art form.

7.
Data Brief ; 57: 110892, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39309713

ABSTRACT

The population of older adults is increasing at a breakneck pace worldwide. This surge presents a significant challenge in providing adequate care, owing to the scarcity of human caregivers. Unintentional falls are a critical health issue, especially for older adults, so detecting falls and providing assistance as early as possible is of the utmost importance. Researchers worldwide have shown interest in designing systems that detect falls promptly, especially through remote monitoring, enabling the timely provision of medical help. The GMDCSA-24 dataset was created to support researchers in developing models that detect falls and other activities. The dataset was generated in three different natural home setups, where falls and activities of daily living (ADL) were performed by four subjects. For versatility, the recordings were made at different times and lighting conditions: during the day with ample light and at night with low light; in addition, the subjects wore different sets of clothes. The actions were captured using a low-cost 0.92-megapixel webcam. The low-resolution video clips make the dataset suitable for real-time systems with limited resources, without any compression or processing of the clips. Users can also use the dataset to check the robustness and generalizability of a system against false positives, since many ADL clips involve complex activities that may be falsely detected as falls, such as sleeping, picking up an object from the ground, and doing push-ups. The dataset contains 81 fall and 79 ADL video clips performed by the four subjects.
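A small sketch of how such a clip-level dataset might be iterated and scored for false positives follows; the directory layout and file naming are assumptions for illustration, not the dataset's documented structure.

    from pathlib import Path

    def clip_labels(root="GMDCSA24"):  # hypothetical folder layout
        """Yield (path, is_fall), assuming fall/ and adl/ subfolders."""
        for sub, is_fall in (("fall", True), ("adl", False)):
            for clip in sorted(Path(root, sub).glob("*.mp4")):
                yield clip, is_fall

    def false_positive_rate(predictions, labels):
        """Fraction of ADL clips a detector wrongly flags as falls."""
        fp = sum(p and not y for p, y in zip(predictions, labels))
        negatives = sum(not y for y in labels)
        return fp / negatives if negatives else 0.0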

8.
Micromachines (Basel) ; 13(1)2021 Dec 31.
Article in English | MEDLINE | ID: mdl-35056238

ABSTRACT

Video object detection and human action detection are applied in many fields, such as video surveillance and face recognition. Video object detection includes object classification and object localization within the frame; human action recognition is the detection of human actions. Video detection is usually more challenging than image detection, since video frames are often blurrier than still images and present additional difficulties such as video defocus, motion blur, and partial occlusion. Nowadays, video detection technology can achieve real-time detection, or highly accurate detection on blurry video frames. In this paper, various video object and human action detection approaches are reviewed and discussed, many of which achieve state-of-the-art results. We mainly review and discuss classic video detection methods based on supervised learning. In addition, the frequently used video object detection and human action recognition datasets are reviewed. Finally, a summary of video detection is presented: video object and human action detection methods can be classified into frame-by-frame (frame-based) detection, key-frame-based detection, and detection using temporal information; the main ways of exploiting the temporal information of adjacent video frames are optical flow, Long Short-Term Memory, and convolution across adjacent frames.
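Of the temporal cues named above, optical flow is the most self-contained to demonstrate; the sketch below computes dense Farneback flow between adjacent frames with OpenCV (the video path is a placeholder).

    import cv2

    cap = cv2.VideoCapture("input.mp4")  # placeholder path
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense optical flow between the previous and current frame.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # flow[..., 0] and flow[..., 1] hold per-pixel x and y displacement.
        prev_gray = gray
    cap.release()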

9.
J Imaging ; 7(8)2021 Aug 21.
Article in English | MEDLINE | ID: mdl-34460794

ABSTRACT

Omnidirectional (or 360°) cameras are acquisition devices that, in the next few years, could have a big impact on video surveillance applications, research, and industry, as they can record a spherical view of a whole environment from every perspective. This paper presents two new contributions to the research community: the CVIP360 dataset, an annotated dataset of 360° videos for distancing applications, and a new method to estimate the distances of objects in a scene from a single 360° image. The CVIP360 dataset includes 16 videos acquired outdoors and indoors, annotated with information about the pedestrians in the scene (bounding boxes) and the distances to the camera of some points in the 3D world, using markers at fixed and known intervals. The proposed distance estimation algorithm is based on geometric facts about the acquisition process of the omnidirectional device and is effectively uncalibrated: the only required parameter is the camera height. The algorithm was tested on the CVIP360 dataset, and empirical results demonstrate that the estimation error is negligible for distancing applications.
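The underlying geometry can be sketched compactly: in an equirectangular frame, the pixel row maps linearly to an elevation angle, and the distance to a ground point then follows from the camera height alone. The sketch below illustrates that relation under our own naming conventions; it is not the authors' implementation.

    import math

    def ground_distance(row, img_height, camera_height_m):
        """Distance to a ground point imaged at pixel `row` (0 = top row)."""
        # Elevation angle: +pi/2 at the top of the frame, -pi/2 at the bottom.
        phi = math.pi * (0.5 - row / img_height)
        if phi >= 0:
            raise ValueError("point lies on or above the horizon")
        return camera_height_m / math.tan(-phi)  # h / tan(depression angle)

    # e.g. a point 70% of the way down a 960-px-tall frame, camera at 1.6 m:
    print(round(ground_distance(672, 960, 1.6), 2))  # -> 2.2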

10.
PeerJ Comput Sci ; 6: e305, 2020.
Article in English | MEDLINE | ID: mdl-33816956

ABSTRACT

We propose a new visualization method for massive supercomputer simulations. The key idea is to scatter multiple omnidirectional cameras to record the simulation via in situ visualization. After the simulations are complete, researchers can interactively explore the data collection of the recorded videos by navigating along a path in four-dimensional spacetime. We demonstrate the feasibility of this method by applying it to three different fluid and magnetohydrodynamics simulations using up to 1,000 omnidirectional cameras.

11.
Healthc Technol Lett ; 6(6): 237-242, 2019 Dec.
Article in English | MEDLINE | ID: mdl-32038864

ABSTRACT

This Letter presents a stable polyp-scene classification method with low false positive (FP) detection. Precise automated polyp detection during colonoscopies is essential for preventing colon-cancer deaths, so there is a demand for a computer-assisted diagnosis (CAD) system for colonoscopies to assist colonoscopists. A high-performance CAD system with spatiotemporal feature extraction via a three-dimensional convolutional neural network (3D CNN), trained on a limited dataset, achieved about 80% detection accuracy on actual colonoscopic videos, suggesting that further improvement of a 3D CNN with larger training data is feasible. However, the ratio between polyp and non-polyp scenes is quite imbalanced in a large colonoscopic video dataset, and this imbalance leads to unstable polyp detection. To circumvent this, the authors propose an efficient and balanced learning technique for deep residual learning: at the beginning of each epoch, the method randomly selects a subset of non-polyp scenes equal in number to the still images of polyp scenes. Furthermore, they introduce post-processing for stable polyp-scene classification, which reduces the FPs that occur in practical applications of polyp-scene classification. They evaluate several residual networks with a large polyp-detection dataset consisting of 1027 colonoscopic videos. In the scene-level evaluation, the proposed method achieves stable polyp-scene classification with 0.86 sensitivity and 0.97 specificity.
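The balanced-epoch idea lends itself to a short sketch: each epoch trains on all polyp scenes plus a freshly drawn, equally sized subset of non-polyp scenes. The data handles and training loop below are schematic placeholders.

    import random

    def balanced_epoch(polyp_items, nonpolyp_items):
        """Shuffled 1:1 polyp/non-polyp mix, re-drawn each epoch."""
        subset = random.sample(nonpolyp_items, k=len(polyp_items))
        epoch = list(polyp_items) + subset
        random.shuffle(epoch)
        return epoch

    # for epoch in range(num_epochs):
    #     for frame, label in balanced_epoch(polyp, nonpolyp):
    #         train_step(frame, label)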

12.
Article in Chinese | WPRIM | ID: wpr-1008904

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication and repetitive behaviors. With the rapid development of computer vision, visual behavior analysis to aid the diagnosis of ASD has received increasing attention. This paper reviews the research on visual behavior analysis for aiding ASD diagnosis. First, the core symptoms and clinical diagnostic criteria of ASD are briefly introduced. Second, the interaction scenes are classified and described according to the clinical diagnostic criteria. Then, the existing relevant datasets are discussed. Finally, we analyze and compare the advantages and disadvantages of visual behavior analysis methods for aiding ASD diagnosis in different interactive scenarios. The challenges in this research field are summarized, and the prospects of related research are presented to promote the clinical application of visual behavior analysis in ASD diagnosis.


Subject(s)
Humans; Autism Spectrum Disorder/diagnosis; Vision, Ocular; Behavior