Hi there, I am trying to mix two video datasets to train my model but they have different mean and standard deviation values. From my previous experiments, I know the min-max scaling yields sub-optimal results so I want to apply Z-Score normalization but I don’t know how to determine the mean and std values. I couldn’t find the best practice on the internet, could you help me or share related resources? I appreciate your help, thank you in advance
Are you certain that normalzing the video data is going to be useful? I don’t have direct experience with this, but it seems to me that the data might already be sufficiently normalized through the video editing process.
Hi, thanks for the answer. Do you mean only applying Min-Max scaling or applying no scaling at all(values between 0-255)? The Data has a normal distribution but it is not a standard distribution. For more context, my model takes a video sample and predicts a scalar value.
By the way, pre-trained models on torchvision also apply normalization to the video data as a preprocessing step (Link).
If you know the max value is 255, then divide everything by 255 and it should be fine.
You don’t really need a mean value of 0.
Yes, it is a choice but it gives worse results compared to normalization with the training split statistics in the single dataset setting. I guess I need to experiment with different settings. Thanks
When you change the normalization, you might also need to change the layer weight initialization method.
Also and I am thinking now, you say that normalization has been applied, if you find out what normalization has been applied to each then reverse for one dataset and make it similar to the other one!
I use pre-trained weights to start but I can test it out too with random weights. I didn’t know it might be necessary, thanks
Z-score normalization is applied and I have the statistics for both of the datasets. If I normalize every dataset individually, I will have apparently good results but afterward, my model will receive samples that come from various distributions that I can’t decide the statistics beforehand. Maybe, I can train the model as you said and normalize the validation split sample-wise to see the real performance.
You can always build a pipeline of code to normalize the incoming data in real time, according to your models demands and this is actually done in Machine Learning models in production.
So my idea is, first you find out how they are normalized, then you denormalize if needed and then unify the normalization process for all incoming channels. This is theoretical anyways.
Combine the 2 datasets and normalize using the mean/std-dev of the combined dataset. Use the mean/std-dev of the combined dataset to also normalize the test data and the data you receive during the production run.
Yes, I am planning to do it like this. I couldn’t find much material regarding this subject
Which Machine Learning courses have you attended? Normalization methods are covered in many of them.
I just couldn’t find what is the best practice to “merge” two or more image/video datasets with different data distributions as I stated in the question. If you know any course or material that specifically and objectively explains this concept I would appreciate that. Of course I already know about and tested the normalization methods we mentioned here but they are not satisfactory or applicable as I explained in my previous comments.
It’s not really feasible to create a model that will give optimum results on two different datasets that have fundamentally different statistical distributions.