ABSTRACT
The dataset contains RGB and depth video frames of various hand movements captured with the Intel RealSense Depth Camera D435. The camera has two channels, so RGB and depth frames are collected at the same time. This large dataset was created to support accurate classification of hand gestures against complex backgrounds. It is made up of 29718 frames in RGB and depth versions, corresponding to various hand gestures performed by different people at different times against complex backgrounds. The hand movements included are scroll-right, scroll-left, scroll-up, scroll-down, zoom-in, and zoom-out. Each sequence consists of 40 frames, and the dataset has a total of 662 sequences corresponding to each gesture. To capture variation in the data, the hand was oriented in various ways during recording.
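As a rough illustration of the sequence structure described above, the following sketch groups an ordered run of frames into fixed-length sequences of 40 and maps the six gesture names to class indices. The function names, the flat frame list, and the decision to drop a trailing partial sequence are assumptions made for this example; the dataset's actual file layout and loading procedure are not specified here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Gesture classes named in the dataset description.
GESTURES = [
    "scroll-right",
    "scroll-left",
    "scroll-up",
    "scroll-down",
    "zoom-in",
    "zoom-out",
]

# Frames per sequence, as stated in the description.
SEQUENCE_LENGTH = 40


def split_into_sequences(
    frames: Sequence[T], length: int = SEQUENCE_LENGTH
) -> List[List[T]]:
    """Group an ordered run of frames into consecutive fixed-length sequences.

    Assumption of this sketch: a trailing partial sequence is dropped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    full = len(frames) // length
    return [list(frames[i * length:(i + 1) * length]) for i in range(full)]


def label_to_index(gesture: str) -> int:
    """Map a gesture name to an integer class index (hypothetical ordering)."""
    return GESTURES.index(gesture)
```

Here `frames` can be any ordered collection, such as file paths or (RGB, depth) pairs, so the same helper would apply to either modality.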