Help: Approach to video data augmentation

I am building a video classifier, and here is my approach to data augmentation. Please let me know whether this approach is valid.

My dataset consists of videos of 25 frames each (the model is designed to receive exactly 25 frames). I will pass all 25 frames of the original videos, and additionally augment by creating copies that skip alternate frames and pad the remaining positions with black (0, 0, 0) frames, since the model expects 3-channel RGB input.
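Roughly, the augmentation I have in mind looks like this (a NumPy sketch; `skip_and_pad` is just an illustrative name, and I assume clips are stored as `(25, H, W, 3)` uint8 arrays):

```python
import numpy as np

def skip_and_pad(video):
    """Keep every other frame, then pad with black frames back to 25.

    video: array of shape (25, H, W, 3), dtype uint8 (RGB).
    Returns an array of the same shape: 13 kept frames followed by
    12 all-zero (black) frames.
    """
    kept = video[::2]  # frames 0, 2, 4, ..., 24 -> 13 frames
    pad = np.zeros((len(video) - len(kept),) + video.shape[1:],
                   dtype=video.dtype)
    return np.concatenate([kept, pad], axis=0)
```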

I am not sure that skipping frames will give realistic training data.

Your concern is justified: skipping alternate frames halves the effective frame rate, and constant black padding is an artificial pattern the model could learn to exploit. You might instead consider modifying the video content itself: for example, the image framing (cropping), rotation, brightness, and color balance. These transforms increase the variation in the training set while keeping each clip realistic.
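A minimal sketch of that idea, assuming clips come as `(T, H, W, 3)` uint8 NumPy arrays; the key point is to draw one set of random parameters per clip and apply it to every frame, so the video stays temporally consistent. The crop ratio and brightness range below are arbitrary illustrative choices (rotation is omitted to stay dependency-free):

```python
import numpy as np

def augment_video(video, rng):
    """Apply the same random flip, brightness, and crop to all frames.

    video: (T, H, W, 3) uint8 array.
    rng:   np.random.Generator, so results are reproducible.
    Returns a (T, ch, cw, 3) uint8 array, cropped to 7/8 of each
    spatial dimension.
    """
    t, h, w, c = video.shape
    out = video.astype(np.float32)

    # Brightness: one scale factor for the whole clip.
    out *= rng.uniform(0.8, 1.2)

    # Horizontal flip with probability 0.5 (same for every frame).
    if rng.random() < 0.5:
        out = out[:, :, ::-1, :]

    # Random crop to 7/8 of each dimension, same window for all frames.
    # Many pipelines resize back to the original size afterwards.
    ch, cw = int(h * 7 / 8), int(w * 7 / 8)
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    out = out[:, y:y + ch, x:x + cw, :]

    return np.clip(out, 0, 255).astype(np.uint8)
```

Because each transform is parameterised once per clip rather than per frame, the augmented video still looks like a plausible recording rather than a flickering sequence.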