Excuse my basic question: can you help me understand why the df used here is the full df and not train_df? Is it a mistake?
Also, what do the seed and batch size mean? Aren't we supposed to feed the model individual images?
" Why can’t we use the same generator as for the training data?
Look back at the generator we wrote for the training data.
- It normalizes each image per batch, meaning that it uses batch statistics.
- We should not do this with the test and validation data, since in a real life scenario we don’t process incoming images a batch at a time (we process one image at a time).
- Knowing the average per batch of test data would effectively give our model an advantage.
- The model should not have any information about the test data."
I quoted the lines above from the first assignment because I am unable to understand them.
Can you please clarify?
Hi @Youstina.Ghoris!
The df in the argument is not a mistake: df is just the name of the function's parameter, so we could give it any name. What matters is that we pass train_df as the argument when we call the function.
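To make that concrete, here is a minimal sketch (the function name, column names, and image size are illustrative assumptions, not necessarily the assignment's exact code). Inside the function the DataFrame is called df; at call time we bind train_df to it:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def get_train_generator(df, image_dir, x_col, y_cols, batch_size=8, seed=1):
    """Yield shuffled batches of normalized training images.

    `df` is only the parameter name: whatever DataFrame the caller
    passes in is referred to as `df` inside this function.
    """
    image_generator = ImageDataGenerator(
        samplewise_center=True,              # subtract each image's own mean
        samplewise_std_normalization=True)   # divide by each image's own std
    return image_generator.flow_from_dataframe(
        dataframe=df,
        directory=image_dir,
        x_col=x_col,
        y_col=y_cols,
        class_mode="raw",
        batch_size=batch_size,
        shuffle=True,
        seed=seed,                           # fixed seed => reproducible shuffle
        target_size=(320, 320))

# Here the training DataFrame is what actually gets bound to `df`:
train_generator = get_train_generator(train_df, "images/", "Image", ["Label"])
```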
seed is set to 1 because the data is shuffled every epoch. The shuffle is random, so setting the seed ensures the shuffle is the same every time you run the code. It also helps with grading, since the output will be the same for everyone.
The batch size is there because we are using mini-batch gradient descent, which is faster than stochastic gradient descent (processing individual images). It processes a batch of images at a time and updates the weights once per batch. If this is new to you, it is worth reading up on mini-batch gradient descent.
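As a toy illustration (pure NumPy, not from the assignment), here is mini-batch gradient descent on a linear model; note the single weight update per batch rather than per image:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 3))       # 1000 samples, 3 features
y = X @ np.array([2.0, -1.0, 0.5])       # targets from a known linear rule
w = np.zeros(3)                          # weights to learn
lr, batch_size = 0.1, 8

for epoch in range(5):
    perm = rng.permutation(len(X))                   # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]         # one mini-batch
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)   # gradient on this batch
        w -= lr * grad                               # one update per batch
```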
Clarification about the quoted explanation:
As I said before, in training we use batches of images, but in real life we usually process one image at a time, so we use a separate generator for the test and validation data.
The test set is new data, so the model should not have any information about it; that is why we must not use batch statistics computed from the test data.
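Here is a hedged sketch of that idea, assuming a Keras-style setup (train_sample is a hypothetical 4-D NumPy array holding a few training images): the test/validation generator gets fixed statistics computed from training data, so every incoming image is normalized identically, whatever batch it arrives in:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# `train_sample` is an assumed array of training images, shape (n, h, w, c).
test_image_generator = ImageDataGenerator(
    featurewise_center=True,              # subtract one fixed mean
    featurewise_std_normalization=True)   # divide by one fixed std
test_image_generator.fit(train_sample)    # statistics from training data only

# Each test image is now normalized with the same training-derived values,
# so no information about the test set leaks into preprocessing.
```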
Hello @bharathikannan, many thanks for your comprehensive reply, and thank you for introducing me to mini-batch gradient descent. This is indeed the first time I have heard of it, and I am going to read up on it.
Have a great day.
You’re welcome @Youstina.Ghoris
Have a nice day. Continue learning!!
Hi @bharathikannan. What are your thoughts on this statement: "We should not do this with the test and validation data, since in a real life scenario we don't process incoming images a batch at a time (we process one image at a time). Knowing the average per batch of test data would effectively give our model an advantage."? I think calculating the mean and std of the whole dataset and using them to normalize the train, test, and validation sets would generate a more realistic model.
In real life we don't always get a batch of images to process; in production we typically receive a single image from a patient, so there is no need for batch processing of the test set.
And even if we did receive test data in batches, our model should not know the batch statistics, since that would give it an advantage: effectively a preview of what the new data looks like.
And @sbansal793, regarding your suggestion to calculate one mean and standard deviation and normalize all of the train, test, and validation sets with them: you are correct, and this is what the course describes; just note that those statistics should come from the training data only, so that nothing leaks from the test set.
However, computing them over the entire training set is expensive, so in the assignment the mean and std are taken from a small sample of the training data.
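A minimal sketch of that cheaper estimate (the load_image helper and the "Image" column name are assumptions for illustration):

```python
import numpy as np

# Estimate normalization statistics from a small random sample of the
# training set instead of loading every training image.
sample_paths = train_df["Image"].sample(n=100, random_state=1)
sample = np.stack([load_image(path) for path in sample_paths])  # assumed helper

mean, std = sample.mean(), sample.std()   # scalar stats over the sample

# Any new (validation or test) image is then normalized with these fixed
# values: normalized = (image - mean) / std
```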
Thank you @sbansal793 and @bharathikannan for the great discussion; I have really learnt a lot.